Abstract

We consider the problem of information aggregation in sensor networks, where one is interested in
computing a function of the sensor measurements. We allow for block processing and study in-network
function computation in directed graphs and undirected graphs. We study how the structure of the
function affects the encoding strategies, and the effect of interactive information exchange. Depending
on the application, there could be a designated collector node, or every node might want to compute
the function.

We begin by considering a directed graph $G = (\mathcal{V}, \mathcal{E})$ on the sensor nodes, where the goal is
to determine the optimal encoders on each edge which achieve function computation at the collector
node. Our goal is to characterize the rate region in $\mathbb{R}^{|\mathcal{E}|}$, i.e., the set of points for which there exist
feasible encoders with given rates which achieve zero-error computation for asymptotically large block
length. We determine the solution for directed trees, specifying the optimal encoder and decoder for
each edge. For general directed acyclic graphs, we provide an outer bound on the rate region by finding
the disambiguation requirements for each cut, and describe examples where this outer bound is tight.

Next, we address the scenario where nodes are connected in an undirected tree network, and every
node wishes to compute a given symmetric Boolean function of the sensor data. Undirected edges
permit interactive computation, and we therefore study the effect of interaction on the aggregation and
communication strategies. We focus on sum-threshold functions, and determine the minimum worst-case
total number of bits to be exchanged on each edge. The optimal strategy involves recursive in-network
aggregation which is reminiscent of message passing. In the case of general graphs, we present a cut-set
lower bound, and an achievable scheme based on aggregation along trees. For complete graphs, we
prove that the complexity of this scheme is no more than twice that of the optimal scheme.
I. INTRODUCTION

Sensor networks are composed of nodes with sensing, wireless communication and com- 
putation capabilities. These networks are designed for applications like fault monitoring, data 
harvesting and environmental monitoring; tasks which can be broadly classified as information 
aggregation. In these applications, one is interested only in computing some relevant function 
of the sensor measurements. For example, one might want to compute the mean temperature 
for environmental monitoring, or the maximum temperature in fire alarm systems. This suggests 
moving away from a data-forwarding paradigm, and focusing on efficient in-network computation 
and communication strategies for the function of interest. This is particularly important since 
sensor nodes may be severely limited in terms of power and bandwidth, and can potentially 
generate enormous volumes of data. 

There are two possible architectures for sensor networks that one might consider. First, one 
could designate a single collector node/fusion center which seeks to compute the function. This 
goal is more appropriate for data harvesting and centralized fault monitoring. Alternately, one 
could suppose that every node in the network wants to compute the function. The latter goal can 
be viewed as providing situational awareness to each sensor node, which could be very useful 
in applications like distributed fault monitoring, adaptive sensing and sensor-actuator networks. 
For example, sensor nodes might want to modify their sampling rate depending on the value of 
the function. We will consider both these problems. 

In order to make progress on the general problem of computing functions of distributed data, 
we will study specific network topologies and some specific classes of functions. In this paper, 
we abstract out the medium access control problem associated with a wireless network, and 
view the network as a graph with edges representing noiseless links. The fundamental challenge 
is to exploit the structure of the particular function, so as to optimally combine transmissions 
at intermediate nodes. Thus, the problem of function computation could be regarded as being 
more general than finding the capacity of a wireless network. In our problem formulation, we 
consider the zero error block computation framework. We allow for nodes to accumulate a 
block of measurements and realize greater efficiency using block coding strategies. However, we
require the function to be computed with zero error for the block. To solve the problem under
this framework, one needs to determine the optimal strategy for communication and computation, 
which includes determining the order in which nodes should transmit and the information that 
nodes must convey whenever they transmit. The strategy for computation may benefit from 
interactive information exchange between nodes, which presents an additional degree of freedom 
vis-a-vis the standard point-to-point communication set-up. 

In Section III, we view the network as a directed graph with edges representing noiseless
links. We thus consider the problem of general function computation in a directed graph
$G = (\mathcal{V}, \mathcal{E})$ with a designated collector. We focus specifically on strategies for combining information
at intermediate nodes, and optimal codes for transmissions on each edge. We consider both
the worst case and the average case complexity for zero error block computation with a joint
probability distribution on the node measurements. Our goal is to characterize the rate region
in $\mathbb{R}^{|\mathcal{E}|}$, i.e., the set of points for which there exist feasible encoders with given rates which
achieve zero-error computation for large enough block length. In the case of tree graphs, we
derive a necessary and sufficient condition for the encoder on each edge, which provides a 
complete characterization of the rate region. The extension of these results to directed acyclic 
graphs is more difficult. However, we provide an outer bound on the rate region by finding the 
disambiguation requirements for each cut, and describe examples where this outer bound is tight. 

In Section IV, we address the problem of computing symmetric Boolean functions in undirected
graphs. The key difference from Section III is that we consider bidirectional links and
study the benefit of interaction between nodes. We show how the approach described in Section
III, together with ideas from communication complexity theory, can be synthesized to develop
a theory of optimal computation of symmetric Boolean functions in undirected graphs. In the
case of tree networks, each edge is a cut-edge, and this allows us to derive a lower bound on the 
number of bits exchanged on each edge, by considering an equivalent two node problem. Further, 
we show that a protocol of recursive in-network aggregation along with a smart interactive coding 
strategy, achieves this lower bound for the class of sum-threshold functions in tree networks. The 
optimal strategy has a simple structure that is reminiscent of message passing, where messages
flow from the leaves towards an interior node, and then flow back from the interior node to
the leaves. In the case of general graphs, we present a cut-set lower bound, and an achievable
scheme based on aggregation along trees. For complete graphs, we show that the complexity of 
this scheme is no more than twice that of the optimal scheme. 

II. RELATED WORK 

In its simplest form, the problem of network function computation can be modeled as a problem 
of computation on graphs obtained by abstracting out the medium access control problem and 
channel noise. This problem is closely related to the network coding problem. Indeed, assuming 
independent measurements $X_i$ and the identity function $f(x_1, x_2, \ldots, x_n) = (x_1, x_2, \ldots, x_n)$, we
have the reverse of the multicast problem studied in [1]. Computing a function of independent
measurements is a network computation problem as opposed to a network coding problem. In
[2], the min-cut upper bound on the rate of computation is shown to be tight for the computation
of divisible functions on tree graphs. In this paper, we generalize this result using a different 
approach. Further, the simplicity of the approach presented allows extensions to the case of 
general graphs and collocated networks. 

The problem of worst-case block function computation was formulated in [3]. The authors
determine the maximum rate at which a symmetric function can be computed in a random
network, given the constraints of the wireless medium. They identify two classes of symmetric
functions, namely type-sensitive functions, exemplified by Mean and Median, and type-threshold
functions, exemplified by Maximum and Minimum. The maximum rates for computation of
type-sensitive and type-threshold functions in random planar networks are shown to be $\Theta(\frac{1}{\log n})$
and $\Theta(\frac{1}{\log \log n})$ respectively, for a network of $n$ nodes. A communication complexity approach
was used to establish upper bounds on the rate of computation in collocated networks. Some 
extensions to the case of finite degree graphs are presented in 

In the study of the communication complexity of multi-party computation [5], one seeks to
minimize the number of bits that must be exchanged in the worst case between two nodes
to achieve zero-error computation of a function of the node variables. The communication
complexity of Boolean functions has been studied in [6], [7]. Further, one can consider the
direct-sum problem [8], where several instances of the problem are considered together to obtain
savings. This block computation approach is used to compute the exact complexity of the Boolean
AND function in [9]. In this paper, we considerably generalize this result, which allows us to
derive optimal strategies for computing more general classes of symmetric Boolean functions in
undirected tree networks. The optimal communication scheme is reminiscent of message passing
algorithms, which have been applied very effectively to the problems of computing marginals
and probabilistic inference [10], [11].

An information-theoretic formulation of this problem combines the complexity of source 
coding of correlated sources with rate distortion, together with the complications introduced 
by the function structure; see [3]. There is little or no work that addresses this most general
framework. The problem of source coding with side information has been studied for the
vanishing error case in [12]. This has been extended in [13] to the case where the receiver
desires to know a certain function $f(X,Y)$ of the single source $X$ and the side information $Y$;
the authors determined the required capacity of the channel between the source and receiver to
be the conditional graph entropy. However, the extension to larger networks has proved difficult.
In zero-error information theory, the problem of source coding with side information ensuring
zero error for finite block length has been studied in [14] and [15]. The problem reduces to
the task of coloring a probabilistic graph defined on the set of source samples. The minimum
entropy of such a coloring approaches the graph entropy, or Körner entropy, as the block length
approaches infinity. Recently, the rate region for multi-round interactive function computation
has been characterized for two nodes [16], and for collocated networks [17].

In this paper we do not address the problem of function computation in noisy networks. 
In [18], the problem of computing parity in a collocated network in the presence of noise is
considered. It is shown that $O(n \log \log n)$ bits suffice to achieve correct computation with high
probability. This has been extended to random planar networks in [19], where the same $\log \log n$
factor of redundancy is shown to be sufficient. Remarkably, this factor was recently shown to
be tight in [20].



III. FUNCTION COMPUTATION IN DIRECTED GRAPHS

In this section, we abstract out the medium access control problem associated with a wireless 
network, and view the network as a directed graph with edges representing essentially noiseless 
wired links between nodes. We formulate the problem of zero error function computation on 
graphs. We suppose that there is a joint probability distribution on the node measurements, 
and allow nodes to realize greater efficiency by using block codes. We will consider both the 
worst case and the average case complexity for zero error block computation. Given a graph, 
the problem we address is to determine the set of rates on the edges which will allow zero 
error function computation for a large enough block length. In essence, we are exploring the 
interaction between the function structure and the structure of the graph; how information needs 
to be routed and combined at intermediate nodes to achieve certain rate vectors. 

In Section III-A, we begin with the two node problem. We compute the number of bits that
node $v_X$ needs to communicate to node $v_Y$ so that the latter can compute a function $f(X,Y)$
with zero error. For correct function computation, an encoder must disambiguate certain pairs of
source symbols of node $v_X$, on which the function disagrees. We show by explicit construction
of a code that this necessary condition is in fact sufficient. This yields the optimal alphabet, and
we calculate the minimum worst case and average case complexity, with the latter obtained by
Huffman coding over the optimal alphabet. In Section III-B, we extend this result to directed
trees with the collector as root, exploiting the fact that each edge is a cut-edge. This yields the 
optimal alphabet for each edge, and we separately optimize the encoders for the worst case and 
the average case. Thus the rate region consists of all rate points dominating a single point that 
is coordinate-wise optimal. 

In Section III-C, we consider directed acyclic graphs. A key difference from the tree case is
the presence of multiple paths to route the data, which present different opportunities to combine
information at intermediate nodes. We arrive at an outer bound to the rate region by finding the
disambiguation requirements for each cut of the directed graph. This outer bound is not always
tight, as we show in Example 3. However, for the worst case computation of finite field parity,
and the maximum or minimum functions, the outer bound is shown to be indeed tight. Further,
the only extreme points of the rate region are rate points corresponding to activating only a tree
subset of edges.

A. Two Node Setting 

1) Worst case complexity: We begin by considering the simple two node problem. Suppose
nodes $v_X$ and $v_Y$ have measurements $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, where the alphabets $\mathcal{X}$ and $\mathcal{Y}$ are
finite sets. Node $v_X$ needs to optimally communicate its information to node $v_Y$ so that a function
$f(x,y)$, which takes values in $\mathcal{D}$, can be computed at $v_Y$ with zero error. We do not consider
the case where $v_X$ and $v_Y$ interactively compute the function. Thus node $v_X$ has an encoder
$\mathcal{C} : \mathcal{X} \to \{0,1\}^*$, which maps its measurement $x$ to the codeword $\mathcal{C}(x)$, and node $v_Y$ has a
decoder $g : \{0,1\}^* \times \mathcal{Y} \to \mathcal{D}$, which maps the received codeword $\mathcal{C}(x)$ and its own measurement
$y$ to a function estimate $g(\mathcal{C}(x), y)$. The set of all possible codewords is called the codebook,
denoted by $\mathcal{C}(\mathcal{X})$.

Definition 1 (Feasible Encoder): An encoder $\mathcal{C}$ is feasible if there exists a decoder
$g : \{0,1\}^* \times \mathcal{Y} \to \mathcal{D}$ such that $g(\mathcal{C}(x), y) = f(x,y)$ for all $(x,y) \in \mathcal{X} \times \mathcal{Y}$. Thus, a feasible encoder is one
that achieves error-free function computation.

Theorem 1 (Characterization of Feasible Encoders): An encoder $\mathcal{C}$ is feasible if and only if,
given $x^1, x^2 \in \mathcal{X}$, $\mathcal{C}(x^1) = \mathcal{C}(x^2)$ implies $f(x^1,y) = f(x^2,y)$ for all $y \in \mathcal{Y}$.
Proof: By definition, if $\mathcal{C}$ is a feasible encoder, then there exists a corresponding decoder $g$ such
that $g(\mathcal{C}(x^1), y) = f(x^1,y)$ and $g(\mathcal{C}(x^2), y) = f(x^2,y)$, for all $y \in \mathcal{Y}$. Further, if $\mathcal{C}(x^1) = \mathcal{C}(x^2)$,
we have $f(x^1,y) = f(x^2,y)$ for all $y \in \mathcal{Y}$.

To prove the converse, we need to construct a decoding function $g : \{0,1\}^* \times \mathcal{Y} \to \mathcal{D}$. For
each codeword $C^*$ in the codebook, define $\mathcal{C}^{-1}(C^*) := \{x \in \mathcal{X} : \mathcal{C}(x) = C^*\}$. For fixed $y \in \mathcal{Y}$
and fixed codeword $C^* \in \mathcal{C}(\mathcal{X})$, the decoder mapping is given by $g(C^*, y) := f(x^{nom}(C^*), y)$
for any arbitrary $x^{nom}(C^*) \in \mathcal{C}^{-1}(C^*)$. We show that this decoder works for any fixed $x$ and
$y$. Indeed, $g(\mathcal{C}(x), y) = f(x^{nom}, y)$ where $x^{nom} \in \mathcal{C}^{-1}(\mathcal{C}(x))$. Thus, $\mathcal{C}(x^{nom}) = \mathcal{C}(x)$ and by
assumption $f(x^{nom}, y) = f(x,y)$. Hence, $g(\mathcal{C}(x), y) = f(x^{nom}, y) = f(x,y)$ for all $y \in \mathcal{Y}$. □
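The condition of Theorem 1 and the decoder used in the converse can be checked by brute force on small alphabets. The following is a minimal sketch, assuming the encoder is given as a dictionary from source symbols to codewords; the names `is_feasible` and `build_decoder` and the toy instance are illustrative and do not appear in the original text.

```python
from itertools import product

def is_feasible(encoder, f, X, Y):
    """Theorem 1 test: symbols sharing a codeword must agree on f for every y."""
    for x1, x2 in product(X, repeat=2):
        if encoder[x1] == encoder[x2]:
            if any(f(x1, y) != f(x2, y) for y in Y):
                return False
    return True

def build_decoder(encoder, f, X):
    """Decoder from the converse: evaluate f at a nominal preimage of the codeword."""
    nominal = {}
    for x in X:
        nominal.setdefault(encoder[x], x)  # first preimage plays the role of x^nom
    return lambda codeword, y: f(nominal[codeword], y)

# Hypothetical toy instance: f(x, y) = max(x, y), X = {0, 1, 2, 3}, Y = {2, 3}.
X, Y = range(4), range(2, 4)
f = max
good = {0: "0", 1: "0", 2: "0", 3: "1"}  # merges only symbols that f cannot tell apart
bad = {0: "0", 1: "0", 2: "1", 3: "1"}   # merges 2 and 3, which differ at y = 2
decoder = build_decoder(good, f, X)
```

In this instance the symbols 0, 1 and 2 are indistinguishable to the receiver because every y is at least 2, so the encoder `good` passes the test, while `bad` fails because max(2, 2) differs from max(3, 2).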



Any feasible encoder $\mathcal{C}$ can be viewed as partitioning the set $\mathcal{X}$ into $\Pi(\mathcal{C}) := \{S_1, S_2, \ldots, S_l\}$
such that for $x^1 \in S_i$, $x^2 \in S_j$, we have $\mathcal{C}(x^1) = \mathcal{C}(x^2)$ if and only if $i = j$. Define an equivalence
relation between $x^1, x^2 \in \mathcal{X}$ by:

$x^1 \sim x^2$ if and only if $f(x^1,y) = f(x^2,y)$ for all $y \in \mathcal{Y}$.

Consider the encoder $\mathcal{C}^{OPT}$ which assigns a distinct codeword to each resulting equivalence
class. Clearly, $\mathcal{C}^{OPT}$ is a feasible encoder, since $\mathcal{C}^{OPT}(x^1) = \mathcal{C}^{OPT}(x^2)$ implies $x^1 \sim x^2$, and
hence $f(x^1,y) = f(x^2,y)$ for all $y \in \mathcal{Y}$. $\mathcal{C}^{OPT}$ is optimal in the sense that any other feasible
encoder $\mathcal{C}$ must have at least as many codewords as $\mathcal{C}^{OPT}$.

Theorem 2 (Optimality of $\mathcal{C}^{OPT}$): Let $\Pi(\mathcal{C}^{OPT}) := \{S_1^{OPT}, S_2^{OPT}, \ldots, S_k^{OPT}\}$ be the partition
of $\mathcal{X}$ generated by $\mathcal{C}^{OPT}$, and let $\Pi(\mathcal{C}) := \{S_1, S_2, \ldots, S_l\}$ be the partition of $\mathcal{X}$ generated by
any other feasible encoder $\mathcal{C}$. Then,

(i) $\Pi(\mathcal{C})$ must be a finer partition than $\Pi(\mathcal{C}^{OPT})$.

(ii) The minimum number of bits that node $v_X$ needs to communicate is $\lceil \log |\Pi(\mathcal{C}^{OPT})| \rceil$.
Proof: First we claim that any subset $S_j \in \Pi(\mathcal{C})$ can have nonempty intersection with exactly
one subset $S_i^{OPT} \in \Pi(\mathcal{C}^{OPT})$. Suppose not. Then there exist $x^1, x^2 \in S_j$ such that $x^1 \in S_{i_1}^{OPT}$
and $x^2 \in S_{i_2}^{OPT}$ with $i_1 \neq i_2$. Since $\mathcal{C}(x^1) = \mathcal{C}(x^2)$, by Theorem 1 we must have $f(x^1,y) = f(x^2,y)$ for all
$y \in \mathcal{Y}$. However, by construction of $\mathcal{C}^{OPT}$, $x^1$ and $x^2$ must belong to distinct equivalence classes,
i.e., $x^1 \not\sim x^2$. Hence, there exists $y^*$ such that $f(x^1,y^*) \neq f(x^2,y^*)$, which is a contradiction.
This shows that the partition generated by any feasible encoder $\mathcal{C}$ must be a further subdivision of the
partition generated by $\mathcal{C}^{OPT}$, i.e., finer than $\Pi(\mathcal{C}^{OPT})$. So node $v_X$ needs to communicate at
least $\lceil \log |\Pi(\mathcal{C}^{OPT})| \rceil$ bits. □
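The partition behind Theorem 2 can be computed directly for small alphabets by grouping source symbols whose rows of function values coincide. The sketch below is illustrative only; the function names and the two toy functions (parity and minimum) are assumptions made for the example.

```python
import math

def optimal_partition(f, X, Y):
    """Group x values by their row (f(x, y) for y in Y); each group is one class."""
    classes = {}
    for x in X:
        signature = tuple(f(x, y) for y in Y)
        classes.setdefault(signature, []).append(x)
    return list(classes.values())

def min_worst_case_bits(f, X, Y):
    """ceil(log2 |partition|): bits needed to index the equivalence classes."""
    return math.ceil(math.log2(len(optimal_partition(f, X, Y))))

# Toy functions on X = Y = {0, 1, 2, 3} (assumed for illustration).
parity = lambda x, y: (x + y) % 2
parity_classes = optimal_partition(parity, range(4), range(4))
parity_bits = min_worst_case_bits(parity, range(4), range(4))
min_bits = min_worst_case_bits(min, range(4), range(4))
```

For parity only the parity of x matters, so two classes and one bit suffice, whereas for the minimum every value of x yields a different row and all four symbols must be separated.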

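We can extend this to the case where $v_X$ collects a block of $N$ measurements $\mathbf{x} = (x_1, x_2, \ldots, x_N) \in \mathcal{X}^N$
and $v_Y$ collects a block of $N$ measurements $\mathbf{y} = (y_1, y_2, \ldots, y_N) \in \mathcal{Y}^N$. We want to find a
block encoder $\mathcal{C}^N : \mathcal{X}^N \to \{0,1\}^*$ so that the vector function $f^{(N)}(\mathbf{x}, \mathbf{y}) = (f(x_1,y_1), \ldots, f(x_N,y_N))$
can be computed without error, for all $\mathbf{x} \in \mathcal{X}^N$, $\mathbf{y} \in \mathcal{Y}^N$. All the above results carry over to the
error-free block computation case. As before, we define an equivalence $\mathbf{x}^1 \sim \mathbf{x}^2$ between $\mathbf{x}^1, \mathbf{x}^2 \in \mathcal{X}^N$
if $f^{(N)}(\mathbf{x}^1, \mathbf{y}) = f^{(N)}(\mathbf{x}^2, \mathbf{y})$ for all $\mathbf{y} \in \mathcal{Y}^N$. The optimal encoder $\mathcal{C}^{OPT,N}$ is once again obtained
by assigning distinct codewords to each equivalence class. Since we are stringing together $N$
independent instances, we have $|\Pi(\mathcal{C}^{OPT,N})| = |\Pi(\mathcal{C}^{OPT})|^N$. Hence the minimum number of
bits per computation that node $v_X$ needs to communicate is $\frac{\lceil N \log |\Pi(\mathcal{C}^{OPT})| \rceil}{N}$, which converges to
$\log |\Pi(\mathcal{C}^{OPT})|$ as $N \to \infty$.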
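To see the effect of block length on the per-computation cost, one can evaluate the expression above for a single-instance partition with k classes. This is a small illustrative sketch; the helper name `bits_per_computation` is hypothetical, and integer arithmetic is used so that the ceiling is exact.

```python
import math

def bits_per_computation(num_classes, block_length):
    """ceil(log2(k**N)) / N: per-instance cost of indexing k**N block classes."""
    total_classes = num_classes ** block_length
    # For any integer m >= 1, (m - 1).bit_length() equals ceil(log2(m)) exactly.
    total_bits = (total_classes - 1).bit_length()
    return total_bits / block_length

# With k = 3 classes the single-shot cost is 2 bits, while longer blocks
# approach log2(3), roughly 1.585 bits per computation.
rates = {n: bits_per_computation(3, n) for n in (1, 5, 100)}
```

The per-computation cost always lies between log2(k) and log2(k) + 1/N, which is why the rounding loss vanishes as the block length grows.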

2) Average case complexity: Suppose now that the measurements $X, Y$ are drawn from the
joint probability distribution $p(X,Y)$, with the goal being to minimize the average number of
bits that need to be communicated, i.e., the average case complexity.

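Definition 2 (Feasible Encoder): An encoder $\mathcal{C} : \mathcal{X} \to \{0,1\}^*$ is feasible if there exists a
decoder $g : \{0,1\}^* \times \mathcal{Y} \to \mathcal{D}$ such that $g(\mathcal{C}(x), y) = f(x,y)$ for all $\{(x,y) \in \mathcal{X} \times \mathcal{Y} : p(x,y) > 0\}$.

Theorem 3: An encoder $\mathcal{C}$ is feasible if and only if, given $x^1, x^2 \in \mathcal{X}$, $\mathcal{C}(x^1) = \mathcal{C}(x^2)$ implies
$f(x^1,y) = f(x^2,y)$ for all $\{y \in \mathcal{Y} : p(x^1,y)p(x^2,y) > 0\}$.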

Proof: By definition, if $\mathcal{C}$ is a feasible encoder, then there exists a corresponding decoder $g$
such that $g(\mathcal{C}(x^1), y) = f(x^1,y)$ and $g(\mathcal{C}(x^2), y) = f(x^2,y)$, for all $\{y \in \mathcal{Y} : p(x^1,y)p(x^2,y) > 0\}$.
Further, if $\mathcal{C}(x^1) = \mathcal{C}(x^2)$, we have $f(x^1,y) = f(x^2,y)$ for all $\{y \in \mathcal{Y} : p(x^1,y)p(x^2,y) > 0\}$.

To prove the converse, we need to construct a decoding function $g : \{0,1\}^* \times \mathcal{Y} \to \mathcal{D}$. For
each codeword $C^*$ in the codebook, define $\mathcal{C}^{-1}(C^*) := \{x \in \mathcal{X} : \mathcal{C}(x) = C^*\}$. For fixed $y \in \mathcal{Y}$
and fixed codeword $C^* \in \mathcal{C}(\mathcal{X})$, the decoder mapping is given by $g(C^*, y) := f(x^{nom}(C^*, y), y)$
for any arbitrary $x^{nom}(C^*, y) \in \mathcal{C}^{-1}(C^*)$ with $p(x^{nom}(C^*, y), y) > 0$. We show that this decoder
works for any fixed $x$ and $y$ with $p(x,y) > 0$. Indeed, $g(\mathcal{C}(x), y) = f(x^{nom}, y)$ where $x^{nom} \in
\mathcal{C}^{-1}(\mathcal{C}(x))$ with $p(x^{nom}, y) > 0$. Thus, $\mathcal{C}(x^{nom}) = \mathcal{C}(x)$ and by assumption $f(x^{nom}, y) = f(x,y)$
since $p(x^{nom}, y)p(x,y) > 0$. Hence, $g(\mathcal{C}(x), y) = f(x^{nom}, y) = f(x,y)$. □

We now define $x^1 \sim x^2$ when $f(x^1,y) = f(x^2,y)$ for all $\{y \in \mathcal{Y} : p(x^1,y)p(x^2,y) > 0\}$. Now the
$\sim$ relation is reflexive and symmetric, but not necessarily transitive. However, if $p(x,y) > 0$ for
all $(x,y) \in \mathcal{X} \times \mathcal{Y}$, then $\sim$ is an equivalence relation. We can construct an encoder $\mathcal{C}^{OPT}$ which
assigns a distinct codeword to each equivalence class. Let $\Pi(\mathcal{C}^{OPT}) := \{S_1^{OPT}, S_2^{OPT}, \ldots, S_k^{OPT}\}$
be the partition of $\mathcal{X}$ generated by $\mathcal{C}^{OPT}$. Analogous to Theorem 2, we can show that the encoder
$\mathcal{C}^{OPT}$ has the optimal alphabet $\mathcal{A}$, with the probability distribution vector $q = \{q_1, q_2, \ldots, q_k\}$
where $q_i := \sum_{x \in S_i^{OPT}} \sum_{y \in \mathcal{Y}} p(x,y)$.



Once the optimal alphabet is fixed, the optimal code $\mathcal{C}^{OPT}$ is the binary Huffman code for
the probability vector $q$. Since the Huffman code has an average code length within one bit of
the entropy,

$H(q_1, q_2, \ldots, q_k) \le E[l(\mathcal{C}^{OPT})] < H(q_1, q_2, \ldots, q_k) + 1$.

The extension to the case where nodes $v_X, v_Y$ collect a block of $N$ i.i.d. measurements is
straightforward. The optimal alphabet is $\mathcal{A}^N$, which has the product distribution $q^N$. The optimal
encoder $\mathcal{C}^{OPT,N}$ is obtained via the Huffman code for the optimal alphabet. Its expected length satisfies

$\frac{H(q^N)}{N} \le \frac{E[l(\mathcal{C}^{OPT,N})]}{N} < \frac{H(q^N) + 1}{N}$.

Hence the minimum number of bits per computation that node $v_X$ needs to communicate
converges to $H(q)$ as $N \to \infty$.
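The average-case recipe (form the class distribution q, then Huffman-code it) can be sketched numerically. The code below is illustrative: it assumes p(x, y) > 0 everywhere so that the classes are well defined, and the names `class_distribution`, `huffman_expected_length` and `entropy` are hypothetical helpers rather than anything from the original text.

```python
import heapq
import math

def entropy(q):
    """Shannon entropy in bits of a probability vector q."""
    return -sum(p * math.log2(p) for p in q if p > 0)

def huffman_expected_length(q):
    """Expected codeword length of a binary Huffman code for q."""
    if len(q) == 1:
        return 0.0  # a single class needs no bits
    heap = list(q)
    heapq.heapify(heap)
    expected = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        expected += a + b  # each merge adds one bit to every leaf beneath it
        heapq.heappush(heap, a + b)
    return expected

def class_distribution(p, f, X, Y):
    """q_i: total probability of each class, assuming p(x, y) > 0 for all (x, y)."""
    classes = {}
    for x in X:
        signature = tuple(f(x, y) for y in Y)
        classes[signature] = classes.get(signature, 0.0) + sum(p[(x, y)] for y in Y)
    return list(classes.values())

# Toy instance (assumed): uniform p on {0..3} x {0..3} and f = parity of x + y.
X, Y = range(4), range(4)
p = {(x, y): 1.0 / 16 for x in X for y in Y}
q = class_distribution(p, lambda x, y: (x + y) % 2, X, Y)
```

The expected length is accumulated by summing the weights of the merged nodes, which is a standard way to obtain the Huffman cost without building the code tree explicitly.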

B. Function Computation in Directed Trees 

Let us now consider computation on a tree graph. Consider a directed tree $G = (\mathcal{V}, \mathcal{E})$ with
nodes $\mathcal{V} := \{v_1, v_2, \ldots, v_n\}$ and root node $v_1$. Edges represent communication links, so that node
$v_j$ can transmit to node $v_i$ if $(v_j, v_i) \in \mathcal{E}$. Each node $v_i$ makes a measurement $x_i \in \mathcal{X}_i$, and
the collector node $v_1$ wants to compute a function $f(x_1, x_2, \ldots, x_n)$ with no error. We seek to
minimize the worst case complexity on each edge.

For each node $v_i$, let $\pi(v_i)$ be the unique node to which node $v_i$ has an outgoing edge, and let
$\mathcal{N}^-(v_i) := \{v_j \in \mathcal{V} : (v_j, v_i) \in \mathcal{E}\}$. The height of a node $v_i$ is the length of the longest directed
path from a leaf node to $v_i$. Define the descendant set $D(v_i)$ to be the subset of nodes in $\mathcal{V}$ from
which there exist directed paths to node $v_i$. The graph induced on $D(v_i)$ is a tree with node $v_i$
as root. Each node transmits exactly once and the computation proceeds in a bottom-up fashion,
starting from the leaf nodes and proceeding up the tree.

Each leaf node $v_i$ has an encoder $\mathcal{C}_i : \mathcal{X}_i \to \{0,1\}^*$ that maps its measurement $x_i$ to a codeword
$\mathcal{C}_i(x_i)$, which is transmitted on the edge $(v_i, \pi(v_i))$. Each non-leaf node $v_j$ for $j \neq 1$ has an
encoder $\mathcal{C}_j$ which maps its measurement $x_j$, as well as the codewords received from $\mathcal{N}^-(v_j)$, to
a codeword transmitted on the edge $(v_j, \pi(v_j))$. Thus the computation proceeds in a bottom-up
fashion. Let $C_i$ denote the codeword transmitted by node $v_i$, and $C_S := \{C_i : v_i \in S\}$ denote the
set of codewords transmitted by nodes in $S$.

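Definition 3: A set of encoders $\{\mathcal{C}_i : 2 \le i \le n\}$ is said to be feasible if there is a decoding
function $g_1$ at the collector node $v_1$ such that $g_1(x_1, C_{\mathcal{N}^-(v_1)}) = f(x_1, x_2, \ldots, x_n)$ for all
$(x_1, x_2, \ldots, x_n) \in \mathcal{X}_1 \times \mathcal{X}_2 \times \ldots \times \mathcal{X}_n$.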

Lemma 1: If a set of encoders $\{\mathcal{C}_i : 2 \le i \le n\}$ is feasible, then the encoder $\mathcal{C}_i$ at node $v_i$
must separate$^1$ $x^1_{D(v_i)} \in \mathcal{X}_{D(v_i)}$ from $x^2_{D(v_i)} \in \mathcal{X}_{D(v_i)}$ if there exists an assignment $x^*_{\mathcal{V} \setminus D(v_i)}$ such
that $f(x^1_{D(v_i)}, x^*_{\mathcal{V} \setminus D(v_i)}) \neq f(x^2_{D(v_i)}, x^*_{\mathcal{V} \setminus D(v_i)})$.

Proof: The removal of edge $(v_i, \pi(v_i))$ separates the graph into two disconnected subtrees $D(v_i)$
and $\mathcal{V} \setminus D(v_i)$. We combine all the nodes in $D(v_i)$ into a supernode $v_X$, and all the nodes in
$\mathcal{V} \setminus D(v_i)$ into a supernode $v_Y$. The result now follows from Theorem 1. □

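To prove the converse, we explicitly define the encoders $\mathcal{C}_2, \mathcal{C}_3, \ldots, \mathcal{C}_n$ and a decoding function
$g$, and prove that it achieves correct function computation. Define the alphabet for encoder $\mathcal{C}_i$
on edge $(v_i, \pi(v_i))$ as

$\mathcal{A}_i := \{ h_i : \mathcal{X}_{\mathcal{V} \setminus D(v_i)} \to \mathcal{D} \text{ s.t. } \exists\, x_{D(v_i)} \in \mathcal{X}_{D(v_i)} \text{ with } h_i(x_{\mathcal{V} \setminus D(v_i)}) = f(x_{D(v_i)}, x_{\mathcal{V} \setminus D(v_i)}) \ \forall x_{\mathcal{V} \setminus D(v_i)} \}$.

Thus codewords sent by node $v_i$ can be viewed as normal forms on variables $x_{\mathcal{V} \setminus D(v_i)}$, or as
partial functions on $\mathcal{X}_{\mathcal{V} \setminus D(v_i)}$.

Encoder at node $v_i$: On receiving the codeword corresponding to $h_j : \mathcal{X}_{\mathcal{V} \setminus D(v_j)} \to \mathcal{D}$ on
incoming edge $(v_j, v_i)$, node $v_i$ assigns nominal values $x^{nom}_{D(v_j)}$ to variables $x_{D(v_j)}$ such that

$f(x^{nom}_{D(v_j)}, x_{\mathcal{V} \setminus D(v_j)}) = h_j(x_{\mathcal{V} \setminus D(v_j)}) \quad \forall x_{\mathcal{V} \setminus D(v_j)} \in \mathcal{X}_{\mathcal{V} \setminus D(v_j)}$. (1)

Given nominal values for all nodes in $D(v_i) \setminus \{v_i\}$, and its own measurement $x_i$, node $v_i$ substitutes
these values to obtain a function $h_i : \mathcal{X}_{\mathcal{V} \setminus D(v_i)} \to \mathcal{D}$ such that

$h_i(x_{\mathcal{V} \setminus D(v_i)}) = f(x^{nom}_{D(v_i) \setminus \{v_i\}}, x_i, x_{\mathcal{V} \setminus D(v_i)}) \quad \forall x_{\mathcal{V} \setminus D(v_i)} \in \mathcal{X}_{\mathcal{V} \setminus D(v_i)}$.

If $v_i \neq v_1$, node $v_i$ then transmits the codeword $C_i$ corresponding to function $h_i \in \mathcal{A}_i$ on the
edge $(v_i, \pi(v_i))$.

Decoding function $g$: The collector node $v_1$ assigns nominal values to the variables $x_{D(v_1) \setminus \{v_1\}}$.
The decoding function $g$ is given by $g(x_1, C_{\mathcal{N}^-(v_1)}) := h_1 = f(x_1, x^{nom}_{D(v_1) \setminus \{v_1\}})$.

Footnote 1: Node $v_i$ does not have access to $x_{D(v_i)}$ directly but only the codewords received from $\mathcal{N}^-(v_i)$. When we say that the
encoder $\mathcal{C}_i$ must separate $x^1_{D(v_i)}$ from $x^2_{D(v_i)}$, we are considering $\mathcal{C}_i$ as an implicit function of $x_{D(v_i)}$.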
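Theorem 4: Let $x_1^{fix}, x_2^{fix}, \ldots, x_n^{fix}$ be any fixed assignment of node values. Let the encoders at
nodes $v_2, v_3, \ldots, v_n$ be as above. Then the function $h_i$ computed by node $v_i$ is

$h_i(x_{\mathcal{V} \setminus D(v_i)}) = f(x^{fix}_{D(v_i)}, x_{\mathcal{V} \setminus D(v_i)}) \quad \forall x_{\mathcal{V} \setminus D(v_i)} \in \mathcal{X}_{\mathcal{V} \setminus D(v_i)}$.

Consequently the decoding function $g$ satisfies $g(x_1^{fix}, C_{\mathcal{N}^-(v_1)}) = f(x_1^{fix}, x_2^{fix}, \ldots, x_n^{fix})$.
Proof: The proof proceeds by induction on the height. The theorem is trivially true for all leaf nodes $v_i$, since
by construction $h_i(x_{\mathcal{V} \setminus D(v_i)}) = f(x_i^{fix}, x_{\mathcal{V} \setminus D(v_i)})$ for all $x_{\mathcal{V} \setminus D(v_i)} \in \mathcal{X}_{\mathcal{V} \setminus D(v_i)}$. Suppose it is true for
all nodes with height less than $k$. Consider a node $v_i$ with height $k$. All the nodes in $\mathcal{N}^-(v_i)$
must have height less than $k$. On receiving the codeword corresponding to $h_j$ on edge $(v_j, v_i)$,
node $v_i$ assigns nominal values to variables in $x_{D(v_j)}$ so that (1) is satisfied. From the induction
assumption, we have

$h_j(x_{\mathcal{V} \setminus D(v_j)}) = f(x^{fix}_{D(v_j)}, x_{\mathcal{V} \setminus D(v_j)}) \quad \forall x_{\mathcal{V} \setminus D(v_j)} \in \mathcal{X}_{\mathcal{V} \setminus D(v_j)}$.

Since this is true for all $v_j \in \mathcal{N}^-(v_i)$, we can simultaneously substitute the nominal values
$x^{nom}_{D(v_i) \setminus \{v_i\}}$ for the variables $x_{D(v_i) \setminus \{v_i\}}$ and the value $x_i^{fix}$ for the variable $x_i$, to obtain a function
$h_i$ satisfying

$h_i(x_{\mathcal{V} \setminus D(v_i)}) = f(x^{nom}_{D(v_i) \setminus \{v_i\}}, x_i^{fix}, x_{\mathcal{V} \setminus D(v_i)}) = f(x^{fix}_{D(v_i)}, x_{\mathcal{V} \setminus D(v_i)}) \quad \forall x_{\mathcal{V} \setminus D(v_i)} \in \mathcal{X}_{\mathcal{V} \setminus D(v_i)}$,

where the second equality follows from (1) and the induction assumption. This establishes the induction step and completes the proof.

For the special case of the collector node $v_1$, we have

$h_1 = f(x^{fix}_{D(v_1)}) = f(x_1^{fix}, x_2^{fix}, \ldots, x_n^{fix})$.

Since this is true for every fixed assignment of the node values, we can achieve error-free
computation of the function. Hence the set of encoders described above is feasible. □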

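For node $v_i$, consider the equivalence relation where $x^1_{D(v_i)} \sim x^2_{D(v_i)}$ if $f(x^1_{D(v_i)}, x_{\mathcal{V} \setminus D(v_i)}) =
f(x^2_{D(v_i)}, x_{\mathcal{V} \setminus D(v_i)})$ for all $x_{\mathcal{V} \setminus D(v_i)} \in \mathcal{X}_{\mathcal{V} \setminus D(v_i)}$. It is easy to check that the equivalence classes
generated by $\sim$ are captured exactly by the alphabet $\mathcal{A}_i$. Thus the above encoders use exactly
the optimal alphabet. Hence, the minimum worst case complexity for encoder $\mathcal{C}_i$ is $\lceil \log |\mathcal{A}_i| \rceil$
on the edge $(v_i, \pi(v_i))$.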
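For small trees, the size of the optimal alphabet on each edge can be found by enumerating the distinct partial functions obtained by fixing the descendant values. The sketch below is a brute-force illustration under assumed conventions: nodes are labelled by integers with node 1 as collector, the tree is given as a child-to-parent dictionary, and `descendants` and `edge_alphabet_size` are hypothetical helper names.

```python
from itertools import product

def descendants(parent, i):
    """D(v_i): node i together with every node that has a directed path to i."""
    result = {i}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in result and child not in result:
                result.add(child)
                changed = True
    return sorted(result)

def edge_alphabet_size(f, alphabets, parent, i):
    """Number of distinct maps (values outside D) -> f obtained by fixing values in D."""
    nodes = sorted(alphabets)
    inside = descendants(parent, i)
    outside = [v for v in nodes if v not in inside]
    tables = set()
    for x_in in product(*(alphabets[v] for v in inside)):
        assignment = dict(zip(inside, x_in))
        table = []
        for x_out in product(*(alphabets[v] for v in outside)):
            assignment.update(zip(outside, x_out))
            table.append(f(*(assignment[v] for v in nodes)))
        tables.add(tuple(table))
    return len(tables)

# Assumed toy instance: chain v3 -> v2 -> v1, binary measurements, arithmetic sum.
parent = {2: 1, 3: 2}
alphabets = {1: (0, 1), 2: (0, 1), 3: (0, 1)}
total = lambda x1, x2, x3: x1 + x2 + x3
size_edge_3 = edge_alphabet_size(total, alphabets, parent, 3)
size_edge_2 = edge_alphabet_size(total, alphabets, parent, 2)
```

In this chain, node 3 must distinguish its two values, while node 2 only needs to forward the partial sum of its subtree, which takes three values, so two bits suffice on the edge into the collector for a single instance.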

The extension to the case where node $v_i$ collects a block of $N$ independent measurements
$\mathbf{x}_i \in \mathcal{X}_i^N$, and the collector node $v_1$ wants to compute the vector function $f^{(N)}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)$,
is straightforward. We can thus achieve a minimum worst case complexity arbitrarily close to
$\log |\mathcal{A}_i|$ bits for encoder $\mathcal{C}_i$. It should be noted that the minimum worst case complexity of
encoder $\mathcal{C}_i$ does not depend on the encoders of the other nodes.

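If there is a probability distribution $p(X_1, X_2, \ldots, X_n)$ on the measurements, then we can obtain
a necessary and sufficient condition by considering all edge cuts.

Lemma 2: Consider a cut which partitions the nodes into $S$ and $\mathcal{V} \setminus S$ with $v_1 \in \mathcal{V} \setminus S$. Let
$\delta^+(S)$ be the set of all edges from nodes in $S$ to nodes in $\mathcal{V} \setminus S$. Then the set of encoders
$\{\mathcal{C}_i : 2 \le i \le n\}$ is feasible if and only if for every cut, the encoder on at least one of the edges in
$\delta^+(S)$ separates $x^1_S, x^2_S \in \mathcal{X}_S$ if there exists an assignment $x^*_{\mathcal{V} \setminus S}$ such that $f(x^1_S, x^*_{\mathcal{V} \setminus S}) \neq f(x^2_S, x^*_{\mathcal{V} \setminus S})$
and $p(x^1_S, x^*_{\mathcal{V} \setminus S})\, p(x^2_S, x^*_{\mathcal{V} \setminus S}) > 0$.

Proof: Necessity is as before. For the converse, suppose the set of encoders is not feasible.
Then there exist assignments $(x_1^*, x^1_{\mathcal{V} \setminus v_1})$ and $(x_1^*, x^2_{\mathcal{V} \setminus v_1})$ such that $f(x_1^*, x^1_{\mathcal{V} \setminus v_1}) \neq f(x_1^*, x^2_{\mathcal{V} \setminus v_1})$
and $p(x_1^*, x^1_{\mathcal{V} \setminus v_1})\, p(x_1^*, x^2_{\mathcal{V} \setminus v_1}) > 0$. However, the codewords received from nodes in $\mathcal{N}^-(v_1)$ are
the same for both assignments. For the cut which separates $v_1$ from $\mathcal{V} \setminus v_1$, there is no encoder
on $\delta^+(S)$ which separates $x^1_{\mathcal{V} \setminus v_1}$ and $x^2_{\mathcal{V} \setminus v_1}$. □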

The above proof of the converse is not constructive. The construction is much harder now
since the encoders are coupled, as shown by the following example.

Example 1: Consider the three node network $G = (\mathcal{V}, \mathcal{E})$ with $\mathcal{V} = \{v_1, v_2, v_3\}$ and $\mathcal{E} =
\{(v_2, v_1), (v_3, v_1)\}$ (see Figure 1(a)). Let $\mathcal{X}_1 = \{x_1^a\}$, $\mathcal{X}_2 = \{x_2^a, x_2^b\}$, $\mathcal{X}_3 = \{x_3^a, x_3^b\}$. Suppose
$p(x_1^a, x_2^a, x_3^a) = p(x_1^a, x_2^b, x_3^b) = \frac{1}{2}$. The function is given by $f(X_1, X_2, X_3) = (X_1, X_2, X_3)$. Considering
the cut $(\{v_2, v_3\}, \{v_1\})$, either $v_2$ or $v_3$ needs to separate its two values. Thus the two
encoders are no longer independent.




Fig. 1. Two simple networks of Examples 1 and 2: (a) network of Example 1; (b) network of Example 2.



In general, we can trade off between the encoders on different edges. However, if we assume
that $p(x_1, x_2, \ldots, x_n) > 0$ for all $(x_1, x_2, \ldots, x_n)$, we can separately minimize the average description
length of each encoder. The optimal encoder constructs a Huffman code on the optimal alphabet
$\mathcal{A}_i$. Suppose $q_i$ is the probability vector induced on the alphabet $\mathcal{A}_i$. Then, by taking long blocks
of measurements, we can achieve a minimum average case complexity arbitrarily close to $H(q_i)$
for encoder $\mathcal{C}_i$.



C. Function Computation in Directed Acyclic Graphs 

The extension from trees to directed acyclic graphs presents significant challenges, since there
is no longer a unique path from every node to the collector. Consider a weakly connected directed
acyclic graph (DAG) $G = (\mathcal{V}, \mathcal{E})$, where each node $v_i$ collects a block of $N$ measurements
$\mathbf{x}_i \in \mathcal{X}_i^N$. The collector node $v_1$ is the unique node with only incoming edges, which wants to
compute the vector function $f^{(N)}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)$ with zero error.

Let the encoder mapping on edge $(v_j, v_i)$ be denoted by $\mathcal{C}^N_{ji}$, which maps the measurement
vector $\mathbf{x}_j$ and the codewords received thus far, to a codeword transmitted on edge $(v_j, v_i)$.
Since there are no cycles in $G$, function computation proceeds in a bottom-up fashion. Node $v_i$
receives codewords $\mathcal{C}^N_{ji}$ on each incoming edge $(v_j, v_i)$ and then transmits a codeword on
each outgoing edge $(v_i, v_k)$. A set of encoders is said to be feasible if there is a decoding function
at the collector node $v_1$ which maps the received codewords to the correct function value. Let
$l_{wc}(\mathcal{C}^N_{ij})$ and $l_{avg}(\mathcal{C}^N_{ij})$ denote the worst case and average case complexity, respectively, of the
encoder $\mathcal{C}^N_{ij}$. The rate of encoder $\mathcal{C}^N_{ij}$ is

$R_{wc}(i,j) = \frac{l_{wc}(\mathcal{C}^N_{ij})}{N}$ and $R_{avg}(i,j) = \frac{l_{avg}(\mathcal{C}^N_{ij})}{N}$.

Thus we can assign a rate vector in $\mathbb{R}^{|\mathcal{E}|}$ to every feasible set of encoders. Let $\mathcal{R}^N_{wc}$ in the worst
case (or $\mathcal{R}^N_{avg}$ in the average case) be the set of feasible rate vectors for encoders of block length
$N$. Then the rate region $\mathcal{R}_{wc}$ (or $\mathcal{R}_{avg}$) is given by the closure in $\mathbb{R}^{|\mathcal{E}|}$ of the finite block length
rate vectors:

$\mathcal{R}_{wc} := \overline{\bigcup_{N \ge 1} \mathcal{R}^N_{wc}}$ and $\mathcal{R}_{avg} := \overline{\bigcup_{N \ge 1} \mathcal{R}^N_{avg}}$.

Consider the following example. 

Example 2: We have three nodes $\{v_1, v_2, v_3\}$ connected as shown in Figure 1(b). Let $\mathcal{X}_1 =
\mathcal{X}_2 = \mathcal{X}_3 = \{0,1,2,3\}$, and suppose node $v_1$ wants to compute $f(X_1, X_2, X_3) = (X_1 + X_2 +
X_3) \bmod 4$. It is easy to check that $(2,0,2)$ and $(2,2,0)$ are feasible rate vectors for $(l_1, l_2, l_3)$.
These are rate vectors associated with the two tree subgraphs. Further, one can also check that 
(2,1,1) is 

1) Outer bound on the rate region: Consider any cut of the graph $G$ which partitions nodes
into subsets $S$ and $\mathcal{V} \setminus S$ with $v_1 \in \mathcal{V} \setminus S$. Let $\delta^+(S)$ be the set of edges from some node in $S$ to
some node in $\mathcal{V} \setminus S$.

Lemma 3: Consider a set of encoders which achieve error free block function computation 



with rate vector {Rwc{iJ)}(vi,vj)eS'- Given any assignments and of the nodes in S, if there 
exists an assignment Xy\^s ^^^^ '^hat f'''^\xy\^g,x^) ^ f^^\xy\^g,^), then the encoders on at least 
one of the edges in 5+ (5) must separate x^ and x^. 

(i) In the worst case block computation scenario, an outer bound on the rate region is given 
by 

£ Rij > log |n(^5i ) I for all cuts (5, r \ 5) , 

where is the partition of into the appropriate equivalence classes. 

(ii) Suppose we have a probability distribution with p{Xi,X2j . . . > 0. Given a cut (5, 

let i? C \ S be the subset of nodes which have a directed path to some node in 5. In the 
average case block computation scenario, an outer bound on the rate region is given by 

£ Rij > H{[Xs] \Xr) for all cuts (S, ^ \ 5) , 

{vi,vj)e8+iS) 

where [Xs] \Xj{ is the equivalence class to which Xs belongs, given Xj^ and a particular 
function. 
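The disambiguation requirement in Lemma 3 can be checked by brute force on small instances. The sketch below is only an illustration for a single instance ($N = 1$): it groups assignments of the nodes in $S$ that never need to be separated and counts the resulting classes. The helper name `cut_classes` and the convention of giving a measurement-free node the dummy alphabet `[0]` are assumptions made for this example, not notation from the text.

```python
import itertools
import math

def cut_classes(f, alphabets, S):
    """Count classes of assignments to the nodes in S that a cut must separate.

    f         : function of a full assignment (tuple with one entry per node)
    alphabets : per-node alphabets (hypothetical convention: [0] = no measurement)
    S         : indices of the nodes on the source side of the cut
    """
    rest = [i for i in range(len(alphabets)) if i not in S]
    signatures = set()
    for x_s in itertools.product(*(alphabets[i] for i in S)):
        row = []
        for x_rest in itertools.product(*(alphabets[i] for i in rest)):
            x = [None] * len(alphabets)
            for i, v in zip(S, x_s):
                x[i] = v
            for i, v in zip(rest, x_rest):
                x[i] = v
            row.append(f(tuple(x)))
        # Two assignments need no separation iff their rows agree everywhere.
        signatures.add(tuple(row))
    return len(signatures)

# Arithmetic sum of two bits held by v2 and v3, with v1 as the collector.
alphabets = [[0], [0, 1], [0, 1]]
arith_sum = lambda x: x[1] + x[2]
classes_23 = cut_classes(arith_sum, alphabets, [1, 2])  # cut S = {v2, v3}
classes_3 = cut_classes(arith_sum, alphabets, [2])      # cut S = {v3}
bound_23 = math.log2(classes_23)
bound_3 = math.log2(classes_3)
```

For this function the cut $S = \{v_2, v_3\}$ has three classes and the cut $S = \{v_3\}$ has two, which gives the single-instance versions of the sum-rate constraints $\log 3$ and $1$.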

2) Achievable region: 

Lemma 4: Consider any directed tree subgraph $G_T$ with root node $v_1$. Let us suppose that
only the edges in $G_T$ can be used for communication. Then we can construct encoders on each
edge which minimize worst case or average case complexity. The rate vector corresponding to
a tree $G_T$ is the limit of the rate vectors for the optimal finite block length encoders for $G_T$.
Thus, for a given tree $G_T$:

(i) The worst case rate vector corresponding to the tree $G_T$ is an extreme point of the worst
case rate region $\mathcal{R}_{wc}$.

(ii) If $p(x_1, x_2, \ldots, x_n) > 0$ for all $(x_1, x_2, \ldots, x_n)$, the rate vector corresponding to the tree $G_T$
is an extreme point of the average case rate region $\mathcal{R}_{avg}$.

The convex hull of the rate points corresponding to trees is achievable. However, we do not
know if these are the only extreme points of the rate region.

3) Some examples: 



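Example 3 (Arithmetic Sum): Consider three nodes $v_1, v_2, v_3$ connected as in Figure 1(b). Let
$\mathcal{X}_2 = \mathcal{X}_3 = \{0, 1\}$, with node $v_1$ having no measurements. Suppose node $v_1$ wants to compute
$f(X_1, X_2, X_3) = X_2 + X_3$. Let $(R_{21}, R_{31}, R_{32})$ be the rate vector associated with edges $(l_1, l_2, l_3)$.
The outer bound on $\mathcal{R}_{wc}$ is:

$$R_{21} \ge 1; \quad R_{21} + R_{31} \ge \log 3; \quad R_{32} + R_{31} \ge 1.$$

The subset of the rate region achievable by trees is:

$$R_{21} = \lambda + (1-\lambda)\log 3, \quad R_{31} = \lambda, \quad R_{32} = (1-\lambda) \quad \text{for } 0 \le \lambda \le 1.$$

Suppose that $X_2, X_3$ are i.i.d. with $p(X_i = 0) = p(X_i = 1) = 0.5$. The outer bound on $\mathcal{R}_{avg}$ is:

$$R_{21} \ge 1; \quad R_{21} + R_{31} \ge \tfrac{3}{2}; \quad R_{32} + R_{31} \ge 1.$$

The subset of the rate region achievable by trees is:

$$R_{21} = \lambda + (1-\lambda)\tfrac{3}{2}, \quad R_{31} = \lambda, \quad R_{32} = (1-\lambda) \quad \text{for } 0 \le \lambda \le 1.$$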
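As a sanity check on Example 3, the following sketch recomputes the entropy of $X_2 + X_3$ for two fair bits and confirms numerically that the time-shared tree points lie inside both outer bounds. The function names (`tree_point`, `satisfies_outer_bound`) and the grid of $\lambda$ values are illustrative choices, not part of the original development.

```python
import math
from collections import Counter

def entropy(pmf):
    """Shannon entropy in bits of a dict {value: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# Distribution of X2 + X3 for two independent fair bits.
counts = Counter(x2 + x3 for x2 in (0, 1) for x3 in (0, 1))
pmf = {s: c / 4 for s, c in counts.items()}
h_sum = entropy(pmf)

def tree_point(lam, relay_rate):
    """Time-share: fraction lam on the star tree, 1 - lam on the tree v3 -> v2 -> v1.

    relay_rate is the rate needed on edge (v2, v1) when v2 forwards the sum.
    Returns (R21, R31, R32).
    """
    return (lam + (1 - lam) * relay_rate, lam, 1 - lam)

def satisfies_outer_bound(point, sum_rate, tol=1e-12):
    r21, r31, r32 = point
    return (r21 >= 1 - tol
            and r21 + r31 >= sum_rate - tol
            and r32 + r31 >= 1 - tol)

grid = [k / 10 for k in range(11)]
worst_ok = all(satisfies_outer_bound(tree_point(lam, math.log2(3)), math.log2(3))
               for lam in grid)
avg_ok = all(satisfies_outer_bound(tree_point(lam, h_sum), h_sum) for lam in grid)
```

The two end points $\lambda = 0$ and $\lambda = 1$ meet the sum-rate constraints with equality on different cuts, which is consistent with them being extreme points.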

Example 4 (Finite field parity): Let $\mathcal{X}_i = \{0, 1, \ldots, D-1\}$ for each node $v_i$. Suppose the
collector node $v_1$ wants to compute the function $(X_1 + X_2 + \ldots + X_n) \bmod D$. In this case, the
outer bound on the worst case rate region described in Lemma 3 is tight. Indeed, since the set
of all outgoing links from a node $v_i$ is a valid cut, we have $\sum_{j : (v_i,v_j) \in \mathcal{E}} R_{ij} \ge \log_2 D$.

An obvious achievable strategy is for every leaf node $v_i$ to split its block and transmit it on the
outgoing edges from $v_i$. Next, we move to a node at height 1. This node receives partial blocks
from various leaf nodes, and can hence compute an intermediate parity for some instances of
the block. It then splits its block along the various outgoing edges. The crucial point is that the
worst case description length per instance remains $\log_2 D$. Proceeding recursively up the DAG,
we see that we can achieve the outer bound.

Example 5 (Max/Min): Let $\mathcal{X}_i = \{0, 1, \ldots, D-1\}$ for each node $v_i$. Suppose the collector node
$v_1$ wants to compute $\max\{X_1, X_2, \ldots, X_n\}$. The outer bound on the worst case rate region described
in Lemma 3 is tight. The achievable strategy is similar to the parity case, where nodes compute
intermediate maximum values and split their blocks on the outgoing edges. Once again, we utilize
the fact that the range of the Max function remains constant irrespective of the number of nodes.

IV. Computing Symmetric Boolean Functions in Undirected Graphs 

In this section, we address the problem of symmetric Boolean function computation in an
undirected graph. Each node has a Boolean variable and all nodes want to compute a given
symmetric Boolean function. As in Section III, we adopt a deterministic framework and consider
the problem of worst case block computation. Further, since the graph is undirected, the set of
admissible strategies includes all interactive strategies, where a node may exchange several
messages with other nodes, with node $i$'s transmission being allowed to depend on all previous
transmissions heard by node $i$, and node $i$'s block of measurements. This is in contrast with the
problem studied in Section III.

We begin by reviewing a toy problem from [9], where the exact communication complexity
of the AND function of two variables is shown to be $\log_2 3$ bits, for block computation. In
Section IV-A, we generalize this approach to the two node problem, where each node $i$ has an
integer variable $X_i$ and both nodes want to compute a function $f(X_1, X_2)$ which only depends
on $X_1 + X_2$. We derive an optimal single-round strategy for the class of sum-threshold functions,
which evaluate to 1 if $X_1 + X_2$ exceeds a threshold, and an approximate strategy for the class of
sum-interval functions, which evaluate to 1 if $a \le X_1 + X_2 \le b$; for the latter class, the upper and
lower bounds do not match. The general achievable strategy involves separation of the source
alphabet, followed by coding, and can be used for any general function.

In Section IV-B, we consider symmetric Boolean function computation on trees. Since every
edge is a cut-edge, we can obtain a cut-set lower bound for the number of bits that must be
exchanged on an edge, by reducing it to a two node problem with general alphabets. For the class
of sum-threshold functions, we are able to match the cut-set bound by constructing an achievable
strategy that is reminiscent of message passing algorithms. In Section IV-D, for general graphs,
we can still derive a cut-set lower bound by considering all partitions of the vertices. We also
propose an achievable scheme that consists of activating a subtree of edges and using the optimal
strategy for transmissions on the tree. While the upper and lower bounds do not match even
for very simple functions, for complete graphs, we show that aggregation along trees provides
a 2-OPT solution.

A. The two node problem 

Consider two nodes 1 and 2 with variables $X_1 \in \{0, 1, \ldots, m_1\}$ and $X_2 \in \{0, 1, \ldots, m_2\}$. Both
nodes wish to compute a function $f(X_1, X_2)$ which only depends on the value of $X_1 + X_2$. To
put this in context, one can suppose there are $m_1$ Boolean variables collocated at node 1 and
$m_2$ Boolean variables at node 2, and both nodes wish to compute a symmetric Boolean function
of the $n := m_1 + m_2$ variables. We pose the problem in a block computation setting, where each
node $i$ has a block of $N$ independent measurements, denoted by $X_i^N$. We consider the class of all
interactive strategies, where nodes 1 and 2 transmit messages alternately, with the value of each
subsequent message being allowed to depend on all previous transmissions and the block of
measurements available at the transmitting node. We define a round to include one transmission
by each node. A strategy is said to achieve correct block computation if, for every choice of
input $(X_1^N, X_2^N)$, each node $i$ can correctly decode the value of the function block $f^N(X_1, X_2)$
using the sequence of transmissions $b_1, b_2, \ldots$ and its own measurement block $X_i^N$.

Let $\mathcal{S}_N$ be the set of strategies for block length $N$ which achieve zero-error block computation,
and let $C(f, S_N, N)$ be the worst-case total number of bits exchanged under strategy $S_N \in \mathcal{S}_N$.
The worst-case per-instance complexity of computing a function $f(X_1, X_2)$ is defined as

$$C(f) := \lim_{N \to \infty} \min_{S_N \in \mathcal{S}_N} \frac{C(f, S_N, N)}{N}.$$

1) Complexity of sum-threshold functions: In this paper, we are only interested in functions
$f(X_1, X_2)$ which only depend on $X_1 + X_2$. Let us suppose without loss of generality that $m_1 \le m_2$.
We define an interesting class of $\{0, 1\}$-valued functions called sum-threshold functions.

Definition 4 (sum-threshold functions): A sum-threshold function $\Pi_\theta(X_1, X_2)$ with threshold
$\theta$ is defined as follows:

$$\Pi_\theta(X_1, X_2) := \begin{cases} 1 & \text{if } X_1 + X_2 \ge \theta, \\ 0 & \text{otherwise.} \end{cases}$$

For the special case where $m_1 = 1$, $m_2 = 1$ and $\theta = 2$, we recover the Boolean AND function,
which was studied in [9]. It is critical to understand this problem before we can address
the general problem of computing symmetric Boolean functions. Consider two nodes with
measurement blocks $X_1^N \in \{0, 1\}^N$ and $X_2^N \in \{0, 1\}^N$, which want to compute the element-wise
AND of the two blocks, denoted by $\wedge^N(X_1, X_2)$.

Theorem 5: Given any strategy $S_N$ for block computation of $X_1 \wedge X_2$,

$$C(X_1 \wedge X_2, S_N, N) \ge N \log_2 3.$$

Further, there exists a strategy $S_N^*$ which satisfies

$$C(X_1 \wedge X_2, S_N^*, N) \le \lceil N \log_2 3 \rceil.$$

Thus, the complexity of computing $X_1 \wedge X_2$ is given by $C(X_1 \wedge X_2) = \log_2 3$.
Proof of achievability: Suppose node 1 transmits first using a prefix-free codebook. Let the
length of the codeword transmitted be $l(X_1^N)$. At the end of this transmission, both nodes know
the value of the function at the instances where $X_1 = 0$. Thus node 2 only needs to indicate its
bits for the instances of the block where $X_1 = 1$. Thus the total number of bits exchanged under
this scheme is $l(X_1^N) + w(X_1^N)$, where $w(X_1^N)$ is the number of 1s in $X_1^N$. For a given scheme,
let us define

$$L := \max_{X_1^N} \left( l(X_1^N) + w(X_1^N) \right)$$

to be the worst case total number of bits exchanged. We are interested in finding the codebook
which will result in the minimum worst-case number of bits.

Any prefix-free code must satisfy the Kraft inequality given by $\sum_{X_1^N} 2^{-l(X_1^N)} \le 1$. Consider a
codebook with $l(X_1^N) = \lceil N \log_2 3 \rceil - w(X_1^N)$. This satisfies the Kraft inequality since $\sum_{X_1^N} 2^{w(X_1^N)} =
3^N$. Hence there exists a valid prefix-free code for which the worst case number of bits exchanged
is $\lceil N \log_2 3 \rceil$, which establishes that $C(X_1 \wedge X_2) \le \log_2 3$.
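The achievability argument can be exercised end to end for a small block length. The sketch below builds one prefix-free code with the lengths $\lceil N \log_2 3 \rceil - w(X_1^N)$ using a canonical assignment (this particular construction, and the names `canonical_prefix_code` and `bits_exchanged`, are assumptions made for illustration; any prefix-free code with these lengths would do) and then measures the total number of bits exchanged for every input pair.

```python
import itertools
import math

def canonical_prefix_code(lengths):
    """Build binary codewords with the requested lengths (assumes Kraft sum <= 1)."""
    code, next_val, prev_len = {}, 0, 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        next_val <<= length - prev_len   # extend the running counter to the new length
        code[sym] = format(next_val, "0{}b".format(length))
        next_val += 1
        prev_len = length
    return code

N = 3
budget = math.ceil(N * math.log2(3))
blocks = list(itertools.product((0, 1), repeat=N))

# Node 1's codeword length: the budget minus the number of 1s in its block.
lengths = {x1: budget - sum(x1) for x1 in blocks}
kraft_sum = sum(2.0 ** -length for length in lengths.values())
code = canonical_prefix_code(lengths)

def bits_exchanged(x1, x2):
    msg1 = code[x1]                                            # node 1 -> node 2
    msg2 = "".join(str(b) for a, b in zip(x1, x2) if a == 1)   # node 2 -> node 1
    return len(msg1) + len(msg2)

worst_case = max(bits_exchanged(x1, x2) for x1 in blocks for x2 in blocks)
prefix_free = all(not b.startswith(a)
                  for a in code.values() for b in code.values() if a != b)
```

With $N = 3$ the Kraft sum is $27/32$ and every input pair costs exactly $\lceil 3 \log_2 3 \rceil = 5$ bits, compared with $6$ bits if both nodes simply exchanged their blocks.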

The lower bound is shown by constructing a fooling set [5] of the appropriate size. We
digress briefly to introduce the concept of fooling sets in the context of two-party communication
complexity [5]. Consider two nodes $X$ and $Y$, each of which takes values in finite sets $\mathcal{X}$ and
$\mathcal{Y}$ respectively, and both nodes want to compute some function $f(X, Y)$ with zero error.

Definition 5 (Fooling Set): A set $E \subseteq \mathcal{X} \times \mathcal{Y}$ is said to be a fooling set if, for any two distinct
elements $(x_1, y_1), (x_2, y_2)$ in $E$, we have either

• $f(x_1, y_1) \neq f(x_2, y_2)$, or

• $f(x_1, y_1) = f(x_2, y_2)$, but either $f(x_1, y_2) \neq f(x_1, y_1)$ or $f(x_2, y_1) \neq f(x_1, y_1)$.

Given a fooling set $E$ for a function $f(X_1, X_2)$, we have $C(f(X_1, X_2)) \ge \log_2 |E|$. We have
described two dimensional fooling sets above. The extension to multi-dimensional fooling sets
is straightforward and gives a lower bound on the communication complexity of the function
$f(X_1, X_2, \ldots, X_n)$.

Lower bound for Theorem 5: We define the measurement matrix $M$ to be the matrix obtained
by stacking the row $X_1^N$ over the row $X_2^N$. Thus we need to find a subset of the set of all
measurement matrices which forms a fooling set. Let $E$ denote the set of all measurement matrices
which are made up of only the column vectors

$$\left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}.$$

We claim that $E$ is the appropriate fooling set. Consider two distinct measurement matrices $M_1, M_2 \in E$. Let $f^N(M_1)$ and
$f^N(M_2)$ be the block function values obtained from these two matrices. If $f^N(M_1) \neq f^N(M_2)$,
we are done. Let us suppose $f^N(M_1) = f^N(M_2)$, and since $M_1 \neq M_2$, there must exist one column
where $M_1$ has $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ but $M_2$ has $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Now if we replace the first row of $M_1$ with the first
row of $M_2$, the resulting measurement matrix, say $M^*$, is such that $f(M^*) \neq f(M_1)$. Thus, the set
$E$ is a valid fooling set. It is easy to verify that $E$ has cardinality $3^N$. Thus, for any strategy
$S_N \in \mathcal{S}_N$, we must have $C(X_1 \wedge X_2, S_N, N) \ge N \log_2 3$, implying that $C(X_1 \wedge X_2) \ge \log_2 3$. This
concludes the proof of Theorem 5. □
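The fooling-set property used in this proof can be verified exhaustively for a small block length. The sketch below enumerates all measurement matrices built from the columns $(0,1)$, $(1,0)$ and $(1,1)$ for $N = 2$ and tests every pair against Definition 5; the variable names and the choice $N = 2$ are illustrative assumptions.

```python
import itertools

N = 2
columns = [(0, 1), (1, 0), (1, 1)]   # allowed (X1, X2) pairs per instance

def block_and(x1, x2):
    """Element-wise AND of two blocks."""
    return tuple(a & b for a, b in zip(x1, x2))

# Each element of E is a pair of rows (x1, x2) assembled from the allowed columns.
E = []
for cols in itertools.product(columns, repeat=N):
    x1 = tuple(c[0] for c in cols)
    x2 = tuple(c[1] for c in cols)
    E.append((x1, x2))

def fooled(p, q):
    """True if the pair (p, q) satisfies the fooling-set condition."""
    (x1, y1), (x2, y2) = p, q
    f = block_and(x1, y1)
    if f != block_and(x2, y2):
        return True
    # Same function value: one of the crossed inputs must give a different value.
    return block_and(x1, y2) != f or block_and(x2, y1) != f

is_fooling_set = all(fooled(p, q) for p, q in itertools.combinations(E, 2))
size = len(E)
```

The set has $3^N$ elements and every pair passes, which matches the $N \log_2 3$ lower bound.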

We now return to the general two node problem with $X_1 \in \{0, 1, \ldots, m_1\}$ and $X_2 \in \{0, 1, \ldots, m_2\}$
and the sum-threshold function $\Pi_\theta(X_1, X_2)$. We will extend the approach presented above to this
general scenario.

Theorem 6: Given any strategy $S_N$ for block computation of the function $\Pi_\theta(X_1, X_2)$,

$$C(\Pi_\theta(X_1, X_2), S_N, N) \ge N \log_2\{\min(2\theta + 1, 2m_1 + 2, 2(n - \theta + 1) + 1)\}.$$

Further, there exist single-round strategies $S_N^*$ and $S_N^{**}$, starting with nodes 1 and 2 respectively,
which satisfy

$$C(\Pi_\theta(X_1, X_2), S_N^*, N) \le \lceil N \log_2\{\min(2\theta + 1, 2m_1 + 2, 2(n - \theta + 1) + 1)\} \rceil,$$

$$C(\Pi_\theta(X_1, X_2), S_N^{**}, N) \le \lceil N \log_2\{\min(2\theta + 1, 2m_1 + 2, 2(n - \theta + 1) + 1)\} \rceil.$$

Thus, the complexity of computing $\Pi_\theta(X_1, X_2)$ is given by $C(\Pi_\theta(X_1, X_2)) = \log_2\{\min(2\theta +
1, 2m_1 + 2, 2(n - \theta + 1) + 1)\}$.

Proof of achievability: We consider three cases:

(a) Suppose $\theta \le m_1 \le m_2$. We specify a strategy $S_N^*$ in which node 1 transmits first. We begin
by observing that inputs $X_1 = \theta, X_1 = (\theta + 1), \ldots, X_1 = m_1$ need not be separated, since
for each of these values of $X_1$, $\Pi_\theta(X_1, X_2) = 1$ for all values of $X_2$. Thus node 1 has an
effective alphabet of $\{0, 1, \ldots, \theta\}$. Suppose node 1 transmits using a prefix-free codeword
of length $l(X_1^N)$. At the end of this transmission, node 2 only needs to indicate one bit for
the instances of the block where $X_1 = 0, 1, \ldots, (\theta - 1)$. Thus the worst-case total number
of bits is

$$L := \max_{X_1^N} \left( l(X_1^N) + w^0(X_1^N) + w^1(X_1^N) + \ldots + w^{\theta-1}(X_1^N) \right),$$

where $w^j(X_1^N)$ is the number of instances in the block where $X_1 = j$. We are interested in
finding the codebook which will result in the minimum worst-case number of bits. From
the Kraft inequality for prefix-free codes we have

$$\sum_{X_1^N \in \{0,1,\ldots,\theta\}^N} 2^{-L + w^0(X_1^N) + w^1(X_1^N) + \ldots + w^{\theta-1}(X_1^N)} \le 1.$$

Consider a codebook with $l(X_1^N) = \lceil N \log_2(2\theta + 1) \rceil - w^0(X_1^N) - \ldots - w^{\theta-1}(X_1^N)$. This satisfies
the Kraft inequality since

$$\sum_{X_1^N \in \{0,1,\ldots,\theta\}^N} 2^{w^0(X_1^N) + w^1(X_1^N) + \ldots + w^{\theta-1}(X_1^N)} = (2\theta + 1)^N.$$

Hence there exists a prefix-free code for which the worst-case total number of bits exchanged
is $\lceil N \log_2(2\theta + 1) \rceil$. Since $\theta \le m_1 \le m_2$, we have

$$C(\Pi_\theta(X_1, X_2), S_N^*, N) \le \lceil N \log_2\{\min(2\theta + 1, 2m_1 + 2, 2(n - \theta + 1) + 1)\} \rceil.$$

The strategy $S_N^{**}$ starting at node 2 can be similarly derived. Node 2 now has an effective
alphabet of $\{0, 1, \ldots, \theta\}$, and we have $C(\Pi_\theta(X_1, X_2), S_N^{**}, N) \le \lceil N \log_2(2\theta + 1) \rceil$.

(b) Suppose $m_1 \le m_2 < \theta$. Consider a strategy in which node 1 transmits first. The inputs
$X_1 = 0, X_1 = 1, \ldots, X_1 = \theta - m_2 - 1$ need not be separated since for each of these values
of $X_1$, $\Pi_\theta(X_1, X_2) = 0$ for all values of $X_2$. Thus node 1 has an effective alphabet of $\{\theta -
m_2 - 1, \theta - m_2, \ldots, m_1\}$. Upon hearing node 1's transmission, node 2 only needs to indicate
one bit for the instances of the block where $X_1 = \theta - m_2, \ldots, m_1$. Consider a codebook with
$l(X_1^N) = \lceil N \log_2(2(m_1 + m_2 - \theta + 1) + 1) \rceil - w^{\theta - m_2}(X_1^N) - \ldots - w^{m_1}(X_1^N)$. This satisfies the
Kraft inequality and we have $L = \lceil N \log_2(2(n - \theta + 1) + 1) \rceil$. Since $m_1 \le m_2 < \theta$, we have
that

$$C(\Pi_\theta(X_1, X_2), S_N^*, N) \le \lceil N \log_2\{\min(2\theta + 1, 2m_1 + 2, 2(n - \theta + 1) + 1)\} \rceil.$$

The strategy $S_N^{**}$ starting at node 2 can be analogously derived.

(c) Suppose $m_1 < \theta \le m_2$. For the case where node 1 transmits first, we construct a trivial
strategy $S_N^*$ where node 1 uses a codeword of length $\lceil N \log_2(m_1 + 1) \rceil$ bits and node 2 replies
with a string of $N$ bits indicating the function block. Thus we have $C(\Pi_\theta(X_1, X_2), S_N^*, N) \le
\lceil N \log_2(2m_1 + 2) \rceil$.

Now consider a strategy where node 2 transmits first. Observe that the inputs $X_2 = 0, X_2 =
1, \ldots, X_2 = \theta - m_1 - 1$ need not be separated since for each of these values of $X_2$, $\Pi_\theta(X_1, X_2) = 0$
for all values of $X_1$. Further, the inputs $X_2 = \theta, X_2 = \theta + 1, \ldots, X_2 = m_2$ need not be separated,
since $\Pi_\theta(X_1, X_2) = 1$ for all values of $X_1$.
Thus node 2 has an effective alphabet of $\{\theta - m_1 - 1, \theta - m_1, \ldots, \theta\}$. Upon hearing node 2's
transmission, node 1 only needs to indicate one bit for the instances of the block where $X_2 =
\theta - m_1, \ldots, \theta - 1$. Consider a codebook with $l(X_2^N) = \lceil N \log_2(2m_1 + 2) \rceil - w^{\theta - m_1}(X_2^N) - \ldots -
w^{\theta - 1}(X_2^N)$. This satisfies the Kraft inequality, because the two end symbols each contribute 1 and the $m_1$
intermediate symbols each contribute 2, and we have $L = \lceil N \log_2(2m_1 + 2) \rceil$. Since
$m_1 < \theta \le m_2$, we have that

$$C(\Pi_\theta(X_1, X_2), S_N^{**}, N) \le \lceil N \log_2\{\min(2\theta + 1, 2m_1 + 2, 2(n - \theta + 1) + 1)\} \rceil.$$

The lower bound is shown by constructing a fooling set as before. Let $E$ denote the set of all
measurement matrices which are made up only of the column vectors from the set

$$Z = \left\{ \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} : 0 \le z_1 \le m_1,\ 0 \le z_2 \le m_2,\ (\theta - 1) \le z_1 + z_2 \le \theta \right\}.$$

We claim that $E$ is the appropriate fooling set. Consider two distinct measurement matrices
$M_1, M_2 \in E$. Let $f^N(M_1)$ and $f^N(M_2)$ be the block function values obtained from these two
matrices. If $f^N(M_1) \neq f^N(M_2)$, we are done. Let us suppose $f^N(M_1) = f^N(M_2)$, and note that
since $M_1 \neq M_2$, there must exist one column where $M_1$ and $M_2$ differ. Suppose $M_1$ has
$\begin{bmatrix} z_{1a} \\ z_{2a} \end{bmatrix}$ while $M_2$ has $\begin{bmatrix} z_{1b} \\ z_{2b} \end{bmatrix}$,
where $z_{1a} + z_{2a} = z_{1b} + z_{2b}$. Assume without loss of generality that
$z_{1a} < z_{1b}$ and $z_{2a} > z_{2b}$.

• If $z_{1a} + z_{2a} = z_{1b} + z_{2b} = \theta - 1$, then the diagonal element $f(z_{1b}, z_{2a}) = 1$ since $z_{1b} + z_{2a} \ge \theta$.
Thus, if we replace the first row of $M_1$ with the first row of $M_2$, the resulting measurement
matrix, say $M^*$, is such that $f(M^*) \neq f(M_1)$.

• If $z_{1a} + z_{2a} = z_{1b} + z_{2b} = \theta$, then the diagonal element $f(z_{1a}, z_{2b}) = 0$ since $z_{1a} + z_{2b} < \theta$.
Thus, if we replace the second row of $M_1$ with the second row of $M_2$, the resulting matrix
$M^*$ is such that $f(M^*) \neq f(M_1)$.



Thus, the set $E$ is a valid fooling set with cardinality $|Z|^N$. For any strategy $S_N$, we have
$C(f, S_N, N) \ge N \log_2 |Z|$. The cardinality of $Z$ can be modeled as the sum of the coefficients of
$Y^{\theta - 1}$ and $Y^{\theta}$ in a carefully constructed polynomial:

$$(1 + Y + \ldots + Y^{m_1})(1 + Y + \ldots + Y^{m_2}).$$

This is solved using the binomial expansion, which gives:

(a) Suppose $\theta \le m_1 \le m_2$. Then $|Z| = 2\theta + 1$.

(b) Suppose $m_1 < \theta \le m_2$. Then $|Z| = 2m_1 + 2$.

(c) Suppose $m_1 \le m_2 < \theta$. Then $|Z| = 2(n - \theta + 1) + 1$.
This completes the proof of Theorem 6. □
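The three closed forms for $|Z|$ can be cross-checked by direct enumeration instead of the polynomial argument. The sketch below counts the columns $(z_1, z_2)$ whose sum is $\theta - 1$ or $\theta$ and compares the count with $\min(2\theta + 1, 2m_1 + 2, 2(n - \theta + 1) + 1)$; the parameter ranges and the restriction $1 \le \theta \le n$ are choices made for this illustration.

```python
def fooling_columns(m1, m2, theta):
    """Columns (z1, z2) with 0 <= z1 <= m1, 0 <= z2 <= m2 and sum in {theta - 1, theta}."""
    return [(z1, z2)
            for z1 in range(m1 + 1)
            for z2 in range(m2 + 1)
            if theta - 1 <= z1 + z2 <= theta]

def closed_form(m1, m2, theta):
    """The quantity inside the logarithm in Theorem 6, with n = m1 + m2."""
    n = m1 + m2
    return min(2 * theta + 1, 2 * m1 + 2, 2 * (n - theta + 1) + 1)

# Compare enumeration and closed form for all small parameters with m1 <= m2.
mismatches = [(m1, m2, theta)
              for m1 in range(1, 7)
              for m2 in range(m1, 7)
              for theta in range(1, m1 + m2 + 1)
              if len(fooling_columns(m1, m2, theta)) != closed_form(m1, m2, theta)]

and_case = len(fooling_columns(1, 1, 2))   # Boolean AND: m1 = m2 = 1, theta = 2
```

For the AND function the enumeration returns the three columns used in the proof of Theorem 5, and no mismatches are found over the tested range.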

2) Complexity of sum-interval functions: 

Definition 6 (sum-interval functions): A sum-interval function $\Pi_{[a,b]}(X_1, X_2)$ on the interval
$[a, b]$ is defined as follows:

$$\Pi_{[a,b]}(X_1, X_2) := \begin{cases} 1 & \text{if } a \le X_1 + X_2 \le b, \\ 0 & \text{otherwise.} \end{cases}$$

Theorem 7: Given any strategy $S_N$ for block computation of $\Pi_{[a,b]}(X_1, X_2)$ where $b \le n/2$,

$$C(\Pi_{[a,b]}(X_1, X_2), S_N, N) \ge N \log_2\{\min(2b - a + 3, m_1 + 1)\}.$$

Further, there exists a single-round strategy $S_N^*$ which satisfies

$$C(\Pi_{[a,b]}(X_1, X_2), S_N^*, N) \le \lceil N \log_2\{\min(2(b + 1) + 1, 2m_1 + 2)\} \rceil.$$

Thus, we have obtained the complexity of computing $\Pi_{[a,b]}(X_1, X_2)$ to within one bit.
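The "within one bit" claim can be checked numerically from the two expressions in Theorem 7. The sketch below compares the per-instance bounds (ignoring the ceiling, which vanishes per instance as $N$ grows) over a range of small parameters with $0 \le a \le b \le n/2$; the helper name `interval_bounds` and the parameter ranges are assumptions made for illustration.

```python
import math

def interval_bounds(a, b, m1):
    """Per-instance lower and upper bounds from Theorem 7, in bits."""
    lower = math.log2(min(2 * b - a + 3, m1 + 1))
    upper = math.log2(min(2 * (b + 1) + 1, 2 * m1 + 2))
    return lower, upper

gaps = []
for m1 in range(1, 9):
    for m2 in range(m1, 9):
        n = m1 + m2
        for b in range(0, n // 2 + 1):
            for a in range(0, b + 1):
                lower, upper = interval_bounds(a, b, m1)
                gaps.append(upper - lower)

max_gap = max(gaps)
```

The gap reaches one bit exactly when the $m_1$ terms are active in both bounds, since $2m_1 + 2 = 2(m_1 + 1)$.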



Proof of Achievability:

(a) Suppose $b \le m_1 \le m_2$. Node 1 has an effective alphabet of $\{0, 1, \ldots, b + 1\}$. Then the worst-case
total number of bits exchanged is given by

$$L := \max_{X_1^N} \left( l(X_1^N) + w^0(X_1^N) + \ldots + w^b(X_1^N) \right).$$

From the Kraft inequality, we can obtain a prefix-free codebook with $L = \lceil N \log_2(2(b + 1) +
1) \rceil$. Thus we have

$$C(\Pi_{[a,b]}(X_1, X_2), S_N^*, N) \le \lceil N \log_2(2(b + 1) + 1) \rceil.$$

(b) Suppose $m_1 \le a \le b \le m_2$ or $a \le m_1 \le b \le m_2$. In either of these scenarios, node 1 has an
effective alphabet of $\{0, 1, \ldots, m_1\}$. Then the worst-case total number of bits exchanged is
given by

$$L := \max_{X_1^N} \left( l(X_1^N) + w^0(X_1^N) + \ldots + w^{m_1}(X_1^N) \right).$$

From the Kraft inequality, we can obtain a prefix-free codebook with $L = \lceil N \log_2(2m_1 + 2) \rceil$.
Thus we have $C(\Pi_{[a,b]}(X_1, X_2), S_N^*, N) \le \lceil N \log_2(2m_1 + 2) \rceil$.
Proof of Lower Bound: We attempt to find a fooling subset $E$ of the set of measurement
matrices. Our first guess would be the set of measurement matrices which are composed of
only column vectors which sum up to $b$ or $b + 1$. However, we see that this is not necessarily
a fooling set, because if $[z_{1a}, z_{2a}]^T$ and $[z_{1b}, z_{2b}]^T$ are two columns which sum to $b + 1$, and if
$z_{1a} \le z_{1b} - (b - a + 2)$, then neither of the diagonal elements evaluates to function value 1. Thus,
we can pick a maximum of $(b - a + 2)$ consecutive elements along the line $z_1 + z_2 = b + 1$, and,
as before, all the elements on the line $z_1 + z_2 = b$. It is easy to check that this modified set of
columns indeed yields a fooling set of measurement matrices. Now we need to compute the
number of such columns.

(a) Suppose $b \le m_1 \le m_2$. The number of columns which sum up to $b$ is equal to $b + 1$. Thus
the size of the fooling set is given by $(2b - a + 3)^N$.

(b) Suppose $a \le m_1 \le b \le m_2$ or $m_1 \le a \le b \le m_2$. The number of columns which sum up to
$b$ is equal to $m_1 + 1$ and the number of columns which sum up to $b + 1$ is equal to $m_1 + 1$.
Thus, the size of the fooling set is given by $\{(m_1 + 1) + \min(m_1 + 1, b - a + 2)\}^N$.
3) A general strategy for achievability: The strategy for achievability used in Theorems 6
and 7 suggests an achievable scheme for any general function $f(X_1, X_2)$ of variables $X_1 \in \mathcal{X}_1$
and $X_2 \in \mathcal{X}_2$ which depends only on the value of $X_1 + X_2$. This is done in two stages.

• Separation: Two inputs $x_{1a}$ and $x_{1b}$ need not be separated if $f(x_{1a}, x_2) = f(x_{1b}, x_2)$ for all
values $x_2$. By checking this condition for each pair $(x_{1a}, x_{1b})$, we can arrive at a partition
of $\{0, 1, \ldots, m_1\}$ into equivalence classes, which can be considered a reduced alphabet, say
$A := \{a_1, \ldots, a_l\}$.

• Coding: Let $A_0$ denote the subset of the alphabet $A$ for which the function evaluates only to
0, irrespective of the value of $X_2$, and let $A_1$ denote the subset of $A$ which always evaluates
to 1. Clearly, from the equivalence class structure, we have $|A_0| \le 1$ and $|A_1| \le 1$. Using
the Kraft inequality as in Theorems 6 and 7, we obtain a scheme $S_N^*$ with complexity
$\log_2(2l - |A_0| - |A_1|)$.
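The two stages above can be sketched as a short routine that takes any $\{0,1\}$-valued function of two integer inputs, groups the values of $X_1$ into equivalence classes, and returns the per-instance rate $\log_2(2l - |A_0| - |A_1|)$ for a scheme in which node 1 transmits first. The names `general_scheme_rate` and `threshold` are hypothetical helpers introduced only for this illustration.

```python
import math

def general_scheme_rate(f, m1, m2):
    """Per-instance rate log2(2l - |A0| - |A1|) when node 1 transmits first."""
    # Separation: group values of X1 whose rows f(x1, .) coincide.
    rows = {}
    for x1 in range(m1 + 1):
        row = tuple(f(x1, x2) for x2 in range(m2 + 1))
        rows.setdefault(row, []).append(x1)
    l = len(rows)
    # Coding: classes that force the value 0 or 1 need no reply bit from node 2.
    a0 = sum(1 for row in rows if all(v == 0 for v in row))
    a1 = sum(1 for row in rows if all(v == 1 for v in row))
    return math.log2(2 * l - a0 - a1), rows

def threshold(theta):
    """Sum-threshold function: 1 if x1 + x2 >= theta, else 0."""
    return lambda x1, x2: 1 if x1 + x2 >= theta else 0

rate_and, classes_and = general_scheme_rate(threshold(2), 1, 1)   # Boolean AND
rate_a, classes_a = general_scheme_rate(threshold(2), 3, 4)       # theta <= m1 <= m2
```

For the AND function this returns $\log_2 3$, and for $\theta = 2$, $m_1 = 3$, $m_2 = 4$ it returns $\log_2 5 = \log_2(2\theta + 1)$, in line with Theorems 5 and 6.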

B. Computing symmetric Boolean functions on tree networks 

Consider a tree graph $T = (V, E)$, with node set $V = \{0, 1, \ldots, n\}$ and edge set $E$. Each node $i$
has a Boolean variable $X_i \in \{0, 1\}$, and every node wants to compute a given symmetric Boolean
function $f(X_1, X_2, \ldots, X_n)$. Again, we allow for block computation and consider all strategies
where nodes can transmit in any sequence with possible repetitions, subject to:

• On any edge $e = (i, j)$, either node $i$ transmits or node $j$ transmits, or neither, and this is
determined from the previous transmissions.

• Node $i$'s transmission can depend on the previous transmissions and the measurement block
$X_i^N$.

For sum-threshold functions, we have a computation and communication strategy that is optimal
for each link.

Theorem 8: Consider a tree network where we want to compute the function $\Pi_\theta(X_1, \ldots, X_n)$.
Let us focus on a single edge $e = (i, j)$ whose removal disconnects the graph into components
$A_e$ and $V \setminus A_e$, with $|A_e| \le |V \setminus A_e|$. For any strategy $S_N \in \mathcal{S}_N$, the number of bits exchanged
along edge $e = (i, j)$, denoted by $C_e(\Pi_\theta(X_1, \ldots, X_n), S_N, N)$, is lower bounded by

$$C_e(\Pi_\theta(X_1, \ldots, X_n), S_N, N) \ge N \log_2\{\min(2\theta + 1, 2|A_e| + 2, 2(n - \theta + 1) + 1)\}.$$

Further, there exists a strategy $S_N^*$ such that for any edge $e$,

$$C_e(\Pi_\theta(X_1, \ldots, X_n), S_N^*, N) \le \lceil N \log_2\{\min(2\theta + 1, 2|A_e| + 2, 2(n - \theta + 1) + 1)\} \rceil.$$

The complexity of computing $\Pi_\theta(X_1, \ldots, X_n)$ is given by

$$C_e(\Pi_\theta(X_1, \ldots, X_n)) = \log_2\{\min(2\theta + 1, 2|A_e| + 2, 2(n - \theta + 1) + 1)\}.$$

Proof: Given a tree network $T$, every edge $e$ is a cut edge. Consider an edge $e$ whose removal
creates components $A_e$ and $V \setminus A_e$, with $|A_e| \le |V \setminus A_e|$. Now let us aggregate the nodes in $A_e$ and
also those in $V \setminus A_e$, and view this as a problem with two nodes connected by edge $e$. Clearly
the complexity of computing the function $\Pi_\theta(X_{A_e}, X_{V \setminus A_e})$ is a lower bound on the worst-case
total number of bits that must be exchanged on edge $e$ under any strategy $S_N$. Hence we obtain

$$C_e(\Pi_\theta(X_1, \ldots, X_n), S_N, N) \ge N \log_2\{\min(2\theta + 1, 2|A_e| + 2, 2(n - \theta + 1) + 1)\}.$$

The achievable strategy $S_N^*$ is derived from the achievable strategy for the two node case in
Theorem 6. While the transmissions back and forth along any edge will be exactly the same,
we need to orchestrate these transmissions so that conditions of causality are maintained. Pick
any node, say $r$, to be the root. This induces a partial order on the tree network. We start with
each leaf in the network transmitting its codeword to the parent. Once a parent node obtains
a codeword from each of its children, it has sufficient knowledge to disambiguate the letters
of the effective alphabet of the subtree, and subsequently it transmits a codeword to its parent.
Thus codewords are transmitted from child nodes to parent nodes until the root is reached. The
root can then compute the value of the function and now sends the appropriate replies to its
children. The children then compute the function and send appropriate replies, and so on. This
sequential strategy depends critically on the fact that, in the two node problem, we derived
optimal strategies starting from either node. For any edge $e$, the worst-case total number of bits
exchanged is given by

$$C_e(\Pi_\theta(X_1, \ldots, X_n), S_N^*, N) \le \lceil N \log_2\{\min(2\theta + 1, 2|A_e| + 2, 2(n - \theta + 1) + 1)\} \rceil. \quad \Box$$
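To make the per-edge expression concrete, the sketch below roots a small tree at an arbitrary node, computes the size of the smaller component $|A_e|$ for every edge, and evaluates $\log_2 \min(2\theta + 1, 2|A_e| + 2, 2(n - \theta + 1) + 1)$. It only evaluates the rate formula and does not simulate the codeword exchange; the function name `edge_rates`, the edge-list input format, and the assumption that $n$ equals the number of nodes (one Boolean variable per node) are choices made for this illustration.

```python
import math

def edge_rates(edges, theta):
    """Per-edge rate from Theorem 8 for a tree with one Boolean variable per node."""
    nodes = sorted({u for e in edges for u in e})
    n = len(nodes)   # assumption: n counts the nodes holding a variable
    adj = {u: [] for u in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Root the tree at an arbitrary node; 'order' lists parents before children.
    root = nodes[0]
    parent, order = {root: None}, [root]
    for u in order:
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)

    # Accumulate subtree sizes from the leaves upwards.
    size = {u: 1 for u in nodes}
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]

    rates = {}
    for u in order[1:]:
        a_e = min(size[u], n - size[u])   # smaller side of the cut edge (u, parent[u])
        k = min(2 * theta + 1, 2 * a_e + 2, 2 * (n - theta + 1) + 1)
        rates[(u, parent[u])] = math.log2(k)
    return rates

path_rates = edge_rates([(1, 2), (2, 3), (3, 4)], theta=2)
```

On the four-node path with $\theta = 2$, the two outer edges need $2$ bits per instance while the middle edge needs $\log_2 5$ bits, since its smaller component has two nodes.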

One can similarly derive an approximately optimal strategy for sum-interval functions, which
we state here without proof.

Theorem 9: Consider a tree network where we want to compute the function $\Pi_{[a,b]}(X_1, \ldots, X_n)$,
with $b \le \frac{n}{2}$. Let us focus on a single edge $e = (i, j)$ whose removal disconnects the graph into
components $A_e$ and $V \setminus A_e$, with $|A_e| \le |V \setminus A_e|$. For any strategy $S_N \in \mathcal{S}_N$, the number of bits
exchanged along edge $e = (i, j)$, denoted by $C_e(f, S_N, N)$, is lower bounded by

$$C_e(\Pi_{[a,b]}(X_1, \ldots, X_n), S_N, N) \ge N \log_2\{\min(2b - a + 3, |A_e| + 1)\}.$$

Further, there exists a strategy $S_N^*$ such that for any edge $e$,

$$C_e(\Pi_{[a,b]}(X_1, \ldots, X_n), S_N^*, N) \le \lceil N \log_2\{\min(2(b + 1) + 1, 2|A_e| + 2)\} \rceil.$$

C. Extension to non-binary alphabets 

The extension to the case where each node draws measurements from a non-binary alphabet
is immediate. Consider a tree network with $n$ nodes where node $i$ has a measurement $X_i \in
\{0, 1, \ldots, l_i - 1\}$. Suppose all nodes want to compute a given function which only depends on
the value of $X_1 + X_2 + \ldots + X_n$. We can define sum-threshold functions in analogous fashion and
derive an optimal strategy for computation.

Theorem 10: Consider a tree network where we want to compute a sum-threshold function,
$\Pi_\theta(X_1, \ldots, X_n)$, of non-binary measurements. Let us focus on a single edge $e$ whose removal
disconnects the graph into components $A_e$ and $V \setminus A_e$. Let us define $l_{A_e} := \sum_{i \in A_e} l_i$. Then the
complexity of computing $\Pi_\theta(X_1, \ldots, X_n)$ is given by

$$C_e(\Pi_\theta(X_1, \ldots, X_n)) = \log_2\{\min(2\theta + 1, 2\min(l_{A_e}, l_{V \setminus A_e}) + 2, 2(l_V - \theta + 1) + 1)\}.$$

Theorem 9 also extends to the case of non-binary alphabets.



D. Computing sum-threshold functions in general graphs 

We now consider the computation of sum-threshold functions in general graphs where the
alphabet is not restricted to be binary. A cut is defined to be a set of edges $F \subseteq E$ which
disconnect the network into two components $A_F$ and $V \setminus A_F$.

Lemma 5 (Cut-set bound): Consider a general network $G = (V, E)$, where node $i$ has measurement
$X_i \in \{0, 1, \ldots, l_i - 1\}$ and all nodes want to compute the function $\Pi_\theta(X_1, \ldots, X_n)$. Given
a cut $F$ which separates $A_F$ from $V \setminus A_F$, the cut-set lower bound specifies that, for any strategy
$S_N$, the number of bits exchanged on the edges in $F$ is lower bounded by

$$C_F(\Pi_\theta(X_1, \ldots, X_n), S_N, N) \ge N \log_2(\min\{2\theta + 1, 2m_F + 2, 2(l_V - \theta + 1) + 1\}),$$

where $l_{A_F} = \sum_{i \in A_F} l_i$ and $m_F = \min(l_{A_F}, l_{V \setminus A_F})$.

A natural achievable strategy is to pick a spanning subtree of edges and use the optimal
strategy on this subtree. The convex hull of the rate vectors of the subtree aggregation schemes
is an achievable region. We wish to compare this with the cut-set region. To simplify matters,
consider a complete graph $G$ where each node $i$ has a measurement $X_i \in \{0, 1\}$. Let $R_{ach}$
be the maximum symmetric rate point achievable by aggregating along trees, and $R_{cut}$ be the
minimum symmetric rate point that satisfies the cut-set constraints.

Theorem 11: For the computation of sum-threshold functions on complete graphs, $R_{ach} \le
2\left(1 - \frac{1}{n}\right) R_{cut}$. In fact, this approximation ratio is tight.

Proof: Let us assume without loss of generality that < Consider all cuts of the type 



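$(\{i\}, V \setminus \{i\})$. This yields

$$R_{cut} \ge \frac{1}{n-1} \max_{i \in V} \min\left(\log_2(2\theta + 1), \log_2(2l_i + 2)\right).$$
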
Now consider the achievable scheme which employs each of the $n$ star graphs for equal sized
sub-blocks of measurements. The rate on edge $(i, j)$ is given by

$$\frac{1}{n}\left(\min(\log_2(2\theta + 1), \log_2(2l_i + 2)) + \min(\log_2(2\theta + 1), \log_2(2l_j + 2))\right).$$

Hence we have

$$R_{ach} \le \frac{2}{n}\left(\min\left(\log_2(2\theta + 1), \max_{i \in V}\{\log_2(2l_i + 2)\}\right)\right) \le 2\left(1 - \frac{1}{n}\right) R_{cut}.$$

Tight Example: Suppose $l_1 = l_2 = \ldots = l_n = l$ and $\theta > l$, then

$$R_{cut} = \frac{1}{n-1}\min(\log_2(2\theta + 1), \log_2(2l + 2)).$$

Further, from the symmetry of the problem, it is clear that the optimal scheme is to employ the
$n$ star graphs for equal sub-blocks of measurements. This gives a symmetric achievable point of

$$R_{ach} = \frac{2}{n}\min(\log_2(2\theta + 1), \log_2(2l + 2)) = 2\left(1 - \frac{1}{n}\right) R_{cut}.$$
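The ratio in the tight example can be reproduced numerically. The sketch below evaluates only the two expressions stated above for equal parameters $l_1 = \ldots = l_n = l$, namely the singleton-cut value and the star-aggregation rate, and compares their ratio with $2(1 - 1/n)$; the function name `symmetric_rates` and the sample values of $n$, $l$ and $\theta$ are illustrative assumptions.

```python
import math

def symmetric_rates(n, l, theta):
    """Singleton-cut value and star-aggregation rate for equal parameter l."""
    per_cut = min(math.log2(2 * theta + 1), math.log2(2 * l + 2))
    r_cut = per_cut / (n - 1)   # the n - 1 edges of the cut ({i}, V \ {i}) share the load
    r_ach = 2 * per_cut / n     # each edge carries traffic in 2 of the n star schemes
    return r_cut, r_ach

ratios = {}
for n in (4, 10, 100):
    r_cut, r_ach = symmetric_rates(n, l=1, theta=2)
    ratios[n] = r_ach / r_cut
```

The ratio depends only on $n$ and approaches $2$ as the graph grows, which is the sense in which tree aggregation is a 2-OPT solution here.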
n \ n J 

E. Linear Programming Formulation 

The above approach of restricting attention to aggregation along star graphs lends itself to a
convenient Linear Programming (LP) formulation. Consider a complete graph $G$. Let us define
the rate region achievable by star graphs in the following way:

$$\mathcal{R}_{ach} = \{A\lambda : \|\lambda\|_1 = 1\},$$

where $A$ is an $\frac{n(n-1)}{2} \times n$ matrix whose $(e, t)$-th entry is the minimum number of bits that must be sent
along edge $e$ under tree aggregation scheme $T_t$. The vector $\lambda$ gives the relative weights assigned to
the different trees. We want to compare the rate vectors achieved by this scheme with the rate
vectors that satisfy the cut constraints. Let $r \in \mathcal{R}_{cut}$ be a given rate vector which satisfies the
cut constraints of Lemma 5. Now, we seek to find an achievable rate vector that is within a factor
$\theta$ of $r$, and further, we want to find the minimum value of such a $\theta$. This can be formulated
as a linear program:

Min. $\theta$
s.t. $A\lambda \le \theta r$
$\|\lambda\|_1 \ge 1$
$\lambda \ge 0$, $\theta \ge 0$

Thus we can obtain the optimal assignment $\lambda^*$ and the optimal factor $\theta^*$. Note that this
assignment depends on the given rate vector $r \in \mathcal{R}_{cut}$. We can also write similar LPs
for other classes of trees.

V. Concluding remarks 

In this paper, we have addressed some problems that arise in the context of information aggregation
in sensor networks. While the general problem of devising optimal strategies for function
computation in wireless networks appears formidable, we have simplified it by abstracting out the 
medium access control problem and analyzing the problem of function computation in graphs. 

We have started with the problem of zero error function computation in directed graphs, and 
analyzed both worst case and average case metrics. For directed tree graphs, we have constructed 
optimal encoding schemes on each edge. This matches the cut-set lower bounds. For general 
DAGs, we have provided an outer bound on the rate region, and an achievable region based 
on aggregating along subtrees. While we have presented some examples where tree aggregation 
schemes are optimal, it remains to quantify the sub-optimality of tree aggregation schemes in 
general. 

We have also addressed the computation of symmetric Boolean functions in undirected graphs, 
where all nodes want to compute the function. For the case of computing sum-threshold functions 
in undirected trees, we have derived the optimal strategy for each edge. The achievable scheme for 
block computation involves a layering of transmissions that is reminiscent of message passing. 
Our framework can be generalized to handle functions of integer measurements which only 
depend on the sum of the measurements. The extension to general graphs is very interesting and 
appears significantly harder. However, a cut-set lower bound can be immediately derived, and in 
some special cases one can show that subtree aggregation schemes provide a 2-OPT solution. 
Once again, it remains to study the suboptimality of tree aggregation schemes in general graphs. 

References 

[1] R. Ahlswede, N. Cai, S. R. Li, and R. W. Yeung. Network information flow. IEEE Transactions on Information Theory, 
46(4): 1204-1216, July 2000. 



[2] R. Appuswamy, M. Franceschetti, N. Karamchandani, and K. Zeger. Network coding for computing. In Proceedings of 

the 46th Annual AUerton Conference on Communication, Control, and Computing, pages 1-6, September 2008. 
[3] A. Giridhar and P. R. Kumar. Computing and communicating functions over sensor networks. IEEE Journal on Selected 
Areas in Communications, 23(4):755-764, April 2005. 
[4] S. Subramanian, R. Gupta, and S. Shakkottai. Scaling bounds for function computation over large networks. In Proceedings 
of the IEEE International Symposium on Information Theory (ISIT), pages 136-140, June 2007. 
[5] E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University Press, New York, NY, USA, 1997. 
[6] I. Wegener. The Complexity of Boolean Functions. J. Wiley & Sons, Inc., New York, NY, USA, 1987. 
[7] A. Orlitsky and A. El Gamal. Average and randomized communication complexity. IEEE Transactions on Information 
Theory, 36:3-16, 1990. 
[8] M. Karchmer, R. Raz, and A. Wigderson. Super-logarithmic depth lower bounds via direct sum in communication complexity. 
In Structure in Complexity Theory Conference, pages 299-304, 1991. 
[9] R. Ahlswede and N. Cai. On communication complexity of vector-valued functions. IEEE Transactions on Information 
Theory, 40:2062-2067, 1994. 
[10] F. R. Kschischang, B. J. Frey, and H. Loeliger. Factor graphs and the sum-product algorithm. IEEE Transactions on 
Information Theory, 47(2):498-519, February 2001. 
[11] S. Aji and R. McEliece. The generalized distributive law. IEEE Transactions on Information Theory, 46(2):325-343, 2000. 
[12] A. D. Wyner and J. Ziv. The rate-distortion function for source coding with side information at the decoder. IEEE 
Transactions on Information Theory, 22(1):1-10, January 1976. 
[13] A. Orlitsky and J. R. Roche. Coding for computing. IEEE Transactions on Information Theory, 47:903-917, 2001. 
[14] H. Witsenhausen. The zero-error side information problem and chromatic numbers. IEEE Transactions on Information 
Theory, 22:592-593, September 1976. 
[15] N. Alon and A. Orlitsky. Source coding and graph entropies. IEEE Transactions on Information Theory, 42:1329-1339, 
September 1996. 
[16] N. Ma and P. Ishwar. Two-terminal distributed source coding with alternating messages for function computation. In 
Proceedings of the IEEE International Symposium on Information Theory (ISIT), pages 51-55, 2008. 
[17] N. Ma, P. Ishwar, and P. Gupta. Information-theoretic bounds for multiround function computation in collocated networks. 
In Proceedings of the IEEE International Symposium on Information Theory (ISIT), pages 2306-2310, 2009. 
[18] R. D. Gallager. Finding parity in a simple broadcast network. IEEE Transactions on Information Theory, 34(2):176-180, 
March 1988. 
[19] L. Ying, R. Srikant, and G. E. Dullerud. Distributed symmetric function computation in noisy wireless sensor networks. 
IEEE Transactions on Information Theory, 53(12):4826-4833, December 2007. 
[20] C. Dutta, Y. Kanoria, D. Manjunath, and J. Radhakrishnan. A tight lower bound for parity in noisy communication 
networks. In Proceedings of the 20th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1056-1065, January 
2008. 
[21] D. West. Combinatorial Mathematics. Course notes for ECE 580, Department of Mathematics, University of Illinois at 
Urbana-Champaign, 2008. 



